Running with Code: Like with scissors, only more dangerous

14 Feb 08

Comment Responses: C# 3.0 Best Practices

I’ve received some comments and composed some responses, included below:

First Comment/Question:

I’ve got questions about performance in LINQ and LINQ to SQL. 
Is it more effective to create a query against the context or against a collection on another object?  For example, which is better:

     = from t in db.Things
        where t.Something = "MyVal"
        && t.ForeignKeyId = 28

or is it better to do this:

     = from t in db.ForeignKeys
        where t.Something = "MyVal"

?  In the former, I’m running against the data context, in the latter, the (kinda) array of foreign key matches

Is it more effective to select the object or a property in the object if I am using an aggregate function?  For example:

    = ( from t in db.Things select t ).Count()

or is it better to do this:

    = ( from t in db.Things select t.ThingId ).Count()

?  In the former, I select the entire t but do nothing.  Does it actually query the data, or just a place-holder?  In the latter, I’ve got a specific property, which I still don’t need.

My answer:

I’m going to preface this by saying – as far as LINQ goes, there’s SO MUCH in the toolset, and I can’t claim to be much more than a novice.  What I can say about it, though, is that I can make some educated guesses about how LINQ will behave vs. LINQ-to-SQL.

In your first example, with the query against the database, looking at the "where t.Something == "MyVal" && t.ForeignKeyId == 28" (you’re using double-equals-signs in those, right? ;-)), you’re probably better off doing the complete query inside of the where clause.  Especially in LINQ-to-SQL, where that where clause is going to end up in a SQL statement anyway, you’re going to be pulling against a database column, which should be fairly fast.  The only performance tweak I could suggest in this instance is ordering the foreign key ID comparison first, to avoid the char-by-char comparison of a string compare if the foreign key IDs don’t match up.  That only helps in LINQ-to-Objects, where && short-circuits; in LINQ-to-SQL the database’s query optimizer decides the evaluation order.  It’s also kind of a difficult choice to make, as it might sacrifice some code readability ("Why is Rob checking a group of values for their foreign key ID?").
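For the LINQ-to-Objects case, here is a minimal sketch of the ordering idea. The Thing class and the Matches helper are hypothetical, made up for this example; the helper exists only to count how many string comparisons actually run.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity standing in for the "Things" in the question.
public class Thing
{
    public int ForeignKeyId { get; set; }
    public string Something { get; set; }
}

public static class ShortCircuitDemo
{
    // Number of string comparisons performed by the last query.
    public static int StringCompares;

    static bool Matches(string actual, string expected)
    {
        StringCompares++;
        return actual == expected;
    }

    public static int CountMatches(List<Thing> things)
    {
        StringCompares = 0;
        // The cheap int comparison comes first; && skips the string
        // comparison whenever the foreign key does not match.
        var query = from t in things
                    where t.ForeignKeyId == 28 && Matches(t.Something, "MyVal")
                    select t;
        return query.Count();
    }

    public static void Main()
    {
        var things = new List<Thing>
        {
            new Thing { ForeignKeyId = 28, Something = "MyVal" },
            new Thing { ForeignKeyId = 28, Something = "Other" },
            new Thing { ForeignKeyId = 7,  Something = "MyVal" }
        };

        int count = CountMatches(things);
        Console.WriteLine("Matches: {0}, string compares: {1}", count, StringCompares);
    }
}
```

With three items, only the two whose foreign key is 28 ever reach the string comparison. Whether that saving is worth the readability cost is a judgment call.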

In your second question related to the Count() aggregate, your LINQ-to-SQL and LINQ-to-Objects queries will (I believe) have similar performance characteristics, but for different reasons.

If you’re using LINQ-to-SQL, the entire query should be ported to SQL and should end up reading like:

SELECT COUNT(*) FROM db.Things;

The alternative (in the alternative case you suggested) would be:

SELECT COUNT(ThingId) FROM db.Things;

COUNT(column) skips rows where that column is NULL, whereas COUNT(*) counts every row, so you have a potential for actual incorrect data if you use the latter form on a nullable column, though if you’re sure to go against something like the PK column you’re probably all right.

In any case, I don’t believe it really matters.  When using LINQ-to-SQL, you’ll pull from the DB.  I just pulled up SQL Profiler, ran a query from LINQ, and got the results I expected:

SELECT COUNT(*) AS [value] FROM [Events].[Events] AS [t0]

So it looks like we’re good.  Unfortunately this query can’t be visualized like non-aggregate queries:

var result = (from ev in context.Events select ev).Count();

Hovering over "result" just results in a 0 being displayed, since Count() returns a plain int rather than a query object that the debugger could visualize.

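As for LINQ-to-Objects, my personal opinion is to use the first syntax (select t) as opposed to projecting out a single property.  The reason for this is that the projection has to produce a new sequence of values, even if it just contains your one property, and Count() then throws those values away.  (Strictly speaking, "select t.ThingId" doesn’t create an anonymous type; that only happens with "select new { ... }", which would allocate a new object per element.)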
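To make the comparison concrete, here is a small LINQ-to-Objects sketch of the three shapes a Count() query might take. The Thing class and the method names are placeholders I made up; treat this as an illustration, not a benchmark.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity for illustration only.
public class Thing
{
    public int ThingId { get; set; }
    public string Name { get; set; }
}

public static class CountDemo
{
    public static int CountWhole(IEnumerable<Thing> things)
    {
        // Yields the existing Thing references; no new object per element.
        return (from t in things select t).Count();
    }

    public static int CountIds(IEnumerable<Thing> things)
    {
        // Projects each Thing to its int ThingId; Count() ignores the values.
        return (from t in things select t.ThingId).Count();
    }

    public static int CountAnonymous(IEnumerable<Thing> things)
    {
        // Only this form creates an anonymous-type object per element.
        return (from t in things select new { t.ThingId }).Count();
    }

    public static void Main()
    {
        var things = new List<Thing>
        {
            new Thing { ThingId = 1, Name = "a" },
            new Thing { ThingId = 2, Name = "b" },
            new Thing { ThingId = 3, Name = "c" }
        };

        Console.WriteLine(CountWhole(things));
        Console.WriteLine(CountIds(things));
        Console.WriteLine(CountAnonymous(things));
    }
}
```

All three return the same count; they differ only in what gets produced along the way.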

One of the things we’re doing in 3.5 is to extend our existing object model, which generally shields us from database-specific implementation code, to support objects like data contexts.  However, any code generated by LINQ-to-SQL will ONLY be in our provider-specific implementation libraries that are dynamically bound at runtime.  This allows us to program with LINQ-to-SQL when we want to, but it requires that we map .NET Entity objects to our custom entity objects.  Still, this is sometimes beneficial.

Second Comment/Question:

I don’t understand this discussion on Object Initializers:

"Always validate input to properties; an exception raised while setting a property will cause a memory leak, even inside try/catch

The object will already be allocated if an exception is raised within a property"

I am not sure how to express what it is that I do not understand, so maybe you can just talk to this a little bit. Typically, I would not validate parameters in an object initializer scenario (unless they came from an external source, of course, such as user input). For example, if I pull data out of a database (my database) and build an object from it, I would not validate the data first – an exception seems correct in this case (corrupt database data). I also would not fear "memory leaks" because I assume that my "partially-constructed" object would be garbage collected like any properly-constructed object would.

In other words, I imagine that using object initializers is *equivalent* to (a) creating an object using the default constructor, followed by (b) setting properties. If an exception occurred setting properties – so what? [At least from "memory leak" point of view.]

My answer:

If you’re populating from a database, your assumption is that the input is going to be valid, and that’s certainly a valid point.  That you’re thinking about this is really what I was after.

The "memory leak" I’m referring to is, again, a potential non-issue as you pointed out (the stranded object will indeed get garbage-collected).  The problem is, for me anyway, that we have no idea and no control over *when* that object gets garbage-collected, or how.  In fact, because it’s implicitly compiler-created, we have no way to do anything at all with that temporary object.

Suppose we had this code:

    SomeObj blah = null;
    try {
        blah = new SomeObj() { ExceptionTest = "This text generates an exception." };
    } catch { }

Even here, the "<>blah_temp01" object (or whatever it’s called) has still been allocated on the heap, but we have no way to access it.  The benefit is that code still behaves as if it was part of a constructor, and the "blah" variable is null (which is my guess of what the developers were gunning for).  But until the garbage collector gets around to it, that memory’s still there, still allocated, and possibly causing heap fragmentation.
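Here is roughly what I understand the compiler to generate for that initializer, written out by hand. The SomeObj class below is a hypothetical stand-in whose setter rejects the test string, and "temp" is my name for the compiler’s hidden temporary.

```csharp
using System;

// Hypothetical class whose property setter rejects some input.
public class SomeObj
{
    private string _exceptionTest;

    public string ExceptionTest
    {
        get { return _exceptionTest; }
        set
        {
            if (value != null && value.Contains("exception"))
                throw new ArgumentException("Invalid value: " + value);
            _exceptionTest = value;
        }
    }
}

public static class InitializerDemo
{
    public static SomeObj TryCreate()
    {
        SomeObj blah = null;
        try
        {
            // Hand-written equivalent of:
            //   blah = new SomeObj() { ExceptionTest = "..." };
            SomeObj temp = new SomeObj();   // the object is allocated here
            temp.ExceptionTest = "This text generates an exception."; // throws
            blah = temp;                    // never reached
        }
        catch (ArgumentException) { }
        return blah;
    }

    public static void Main()
    {
        Console.WriteLine(TryCreate() == null ? "blah is null" : "blah is set");
    }
}
```

The assignment to blah happens only after every property has been set, which is why the caller ends up with null and no handle on the allocated object.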

The other difficulty we run into, and this is perhaps a cheap excuse, is that when all of the properties are set on the same line, we can’t tell (at least in Visual Studio) which line is causing the exception.  We could do this:

    blah = new SomeObj() {
                            ExceptionTest = "This text generates an exception."
                         };

But does that really gain you much over:

    blah = new SomeObj();
    blah.ExceptionTest = "This text generates an exception.";

Ultimately, my point is this: when you don’t use initializers, you can handle cleaning up and disposing, and you otherwise know that an object exists.  You don’t when you do use initializers.  Suppose I have a smart card file stream that implements IDisposable.

Stream fs = new SmartCardFileStream() { FilePath = "003F/0040", Mode = FileMode.ReadWrite };

Suppose that, for whichever reason, the file opens when you set FilePath.  But, FileMode.ReadWrite isn’t supported (only read or write).  You can’t programmatically close that file; you have to wait for the finalizer to be invoked.  Arguably, you should wrap that declaration within a using {} block, but that’s not always feasible (e.g., when you need to deal with it across events in a WinForms app), and it wouldn’t help with an initializer anyway: the initializer runs before the using block takes ownership of the object, so Dispose is never called if a property setter throws.
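A quick sketch of how the smart card scenario plays out inside a using block. FakeCardStream is a hypothetical stand-in I wrote for this example (not a real API); it tracks "open" files with a static counter, and Mode is a plain string to keep things self-contained.

```csharp
using System;

// Hypothetical stand-in for the smart card stream; not a real API.
public class FakeCardStream : IDisposable
{
    public static int OpenCount;   // streams currently "open"

    private string _filePath;
    private string _mode;

    public string FilePath
    {
        get { return _filePath; }
        set { _filePath = value; OpenCount++; }   // "opens" the file
    }

    public string Mode
    {
        get { return _mode; }
        set
        {
            if (value == "ReadWrite")
                throw new NotSupportedException("Only Read or Write is supported.");
            _mode = value;
        }
    }

    public void Dispose()
    {
        if (_filePath != null) { _filePath = null; OpenCount--; }
    }
}

public static class DisposeDemo
{
    public static int WithInitializer()
    {
        FakeCardStream.OpenCount = 0;
        try
        {
            // The setter throws before the using block owns the object.
            using (var fs = new FakeCardStream { FilePath = "003F/0040", Mode = "ReadWrite" })
            {
            }
        }
        catch (NotSupportedException) { }
        return FakeCardStream.OpenCount;
    }

    public static int WithoutInitializer()
    {
        FakeCardStream.OpenCount = 0;
        try
        {
            using (var fs = new FakeCardStream())
            {
                fs.FilePath = "003F/0040";
                fs.Mode = "ReadWrite";   // throws, but using disposes fs
            }
        }
        catch (NotSupportedException) { }
        return FakeCardStream.OpenCount;
    }

    public static void Main()
    {
        Console.WriteLine("With initializer, still open: {0}", WithInitializer());
        Console.WriteLine("Without initializer, still open: {0}", WithoutInitializer());
    }
}
```

In the initializer version the stream is left open after the exception; constructing first and setting properties inside the block lets Dispose run.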

The primary difference is in who gets a chance to clean up.  When an exception is raised in a constructor, the constructor itself can release anything it acquired before the exception escapes, and the caller never sees a half-built object (a static constructor is a separate case: it raises a TypeInitializationException, making the whole class inaccessible for the duration).  A partially-"constructed" object that raised an exception during a property-set inside an initializer is problematic in that it doesn’t get that benefit: nobody holds a reference to it, so nobody can clean it up, and we have to rely on the garbage collector to determine that there are no outstanding references to it at some later GC pass.

Summary

I chopped off some of the thank yous/hellos/I appreciates from the e-mails, and just wanted to once again say that I appreciate the comments and feedback.  Hopefully, those of you who chose to comment weren’t too thrown off by my e-mail replies; I’m not particularly a fan of starting threads and posting comments on my own blog post (although I may once I get a new blog).  Also – I’d just like to point out that these are my personal opinions and not necessarily the end-all, be-all of C#; there are certainly places to use each new feature, and my goal is to investigate how features are implemented so that we know when a situation calls for them.
