Designing an airline passenger reservation system

25 November 2008

Architecture - Linq-to-SQL and set-based operations: Delete statements

Filed under: Architecture — Tags: , , , , , , — Kristofer @ 21:42

A few weeks back I published a code sample showing an implementation of set-based update operations using Linq-to-SQL. I also promised a followup showing batch deletes, and “insert into…select from” so I guess it is time for a followup. This time I will cover set-based deletes.

To avoid unnecessary duplication the code in this article is an incremental update to the code in my previous article on the subject.

The original set-based update-statement implementation is here:
http://blog.huagati.com/res/index.php/2008/11/05/architecture-linq-to-sql-and-set-based-operations-update-statements/

On a semi-related note: a couple of days ago, Microsoft’s Bart de Smet (author of the B# blog) also recognized the need for language integrated set-based operations in his “Dude, where’s my LINQ DML” blog entry. Although his article left out the actual implementation - he’s instead elaborating on a generic interface to front this functionality. (Although I have to say that I think my implementation results in more clean end-user code than his suggested interface. I know I’m abusing lambdas for the “set”-part but I think it is clear enough anyway…).

Set-based delete statements

I’m not going to elaborate on the rationale for set-based deletes rather than a row-based approach. The reasons for set-based/batch deletes are exactly the same as set-based/batched updates, already covered in the previous article.

Similar to the update statements I will have a couple of overloaded functions allowing for some flexibility in how it is used. The simple form takes a type and a criteria lambda:


dc.Delete<Employee>(
    (where => where.City.StartsWith("XYZ"))
    );

The SQL generated by this statement is:

delete from dbo.Employees  where EmployeeID in (SELECT [t0].[EmployeeID]
FROM [dbo].[Employees] AS [t0]
WHERE [t0].[City] LIKE @p0)

Just like the update-statement version, we can use navigation properties, CLR functions etc to set up more complex expressions in a single lambda:


dc.Delete<Customer>(
    where => where.Orders.Max(
        od => od.OrderDate
        ) < DateTime.Now.AddYears(-5)
    && where.Country == "Ukraine"
    );

 …and because we’re using Linq-to-SQL to generate the SQL, the where clause lambda is translated into nice SQL. After concatenation with the delete part of the statement the line above translates into:

delete from dbo.Customers  where CustomerID in (SELECT [t0].[CustomerID]
FROM [dbo].[Customers] AS [t0]
WHERE (((
    SELECT MAX([t1].[OrderDate])
    FROM [dbo].[Orders] AS [t1]
    WHERE [t1].[CustomerID] = [t0].[CustomerID]
    )) < @p0) AND ([t0].[Country] = @p1))

 If we need a more complex filter than what a lambda expression on the existing L2S classes allow, or if we want a more readable lookup query as the base there’s also an overload that accept an IQueryable query:



IQueryable<Customer> customersOfKievRep =
    from cust in dc.Customers
    join ord in dc.Orders on cust.CustomerID equals ord.CustomerID
    join emp in dc.Employees on ord.EmployeeID equals emp.EmployeeID
    where emp.Region == "Kiev"
    select cust;
dc.Delete<Customer>(customersOfKievRep, LockModeEnum.Row); 

The last example also shows the lock mode parameter in use, so the generated SQL will be:

delete from dbo.Customers with (rowlock) where CustomerID in (SELECT [t0].[CustomerID]
FROM [dbo].[Customers] AS [t0]
INNER JOIN [dbo].[Orders] AS [t1] ON [t0].[CustomerID] = [t1].[CustomerID]
INNER JOIN [dbo].[Employees] AS [t2] ON [t1].[EmployeeID] = ([t2].[EmployeeID])
WHERE [t2].[Region] = @p0)

 

So. Enough usage examples, on to the implementation.

Implementation

Here’s the implementation. Add it to the code sample from the previous article and you’re ready to use it. As with the previous example, feel free to use it in your own L2S-based software. Remember that you use it at your own risk, don’t blame me if it breaks anything. If you want to translate/reuse any part with LLBLGen, subsonic, NHibernate etc that’s fine too. The only thing not allowed is to convert it for use with Entity Framework, or to re-publish in blog articles without prior written permission. :)


public static void Delete<T>(this DataContext dc, Expression<Func<T, bool>> selectionCriteria)
{
    IQueryable<T> baseQuery = ((IQueryable<T>)dc.GetTable(typeof(T))).Where(selectionCriteria);
    Delete<T>(dc, baseQuery, LockModeEnum.NotSpecified);
}

public static void Delete<T>(this DataContext dc, IQueryable<T> baseQuery, LockModeEnum lockMode)
{
    //get the type representing the table we're dealing with
    Type baseType = typeof(T);

    //get primary key members for the type we're dealing with
    List<MetaDataMember> primaryKeyMembers = GetPKMembers(dc, baseType);
    if (primaryKeyMembers.Count == 0)
    {
        throw new InvalidOperationException(baseType.Name + " has no primary key members.");
    }

    //create a new class containing only the PK members of the table we're going to update
    MemberBinding[] memberBindings = null;
    ParameterExpression[] baseTypeParams = new ParameterExpression[] { Expression.Parameter(baseType, "") };
    Type pkType = CreatePKClass(baseType, primaryKeyMembers, baseTypeParams[0], out memberBindings);

    //change the query to only get the PK members
    IQueryable pkQuery = GetPKQuery(baseQuery, pkType, baseTypeParams, memberBindings);

    //extract the select query for the primary key members
    DbCommand pkSelectCommand = dc.GetCommand(pkQuery);
    string whereClause = null;
    if (primaryKeyMembers.Count == 1)
    {
        whereClause = "where " + primaryKeyMembers[0].Name + " in (" + pkSelectCommand.CommandText + ")";
    }
    else
    {
        string ct = pkSelectCommand.CommandText;
        whereClause = ct.Substring(ct.IndexOf("\r\nfrom ", StringComparison.InvariantCultureIgnoreCase));
    }

    //get the table name
    string qualifiedTableName = dc.Mapping.GetTable(baseType).TableName;

    //concatenate the delete statement
    string deleteStatement = ReplaceSQLParameters(
        "delete from " + qualifiedTableName + " " + GetLockHint(lockMode)
        + " " + whereClause);

    //execute the delete
    dc.ExecuteCommand(deleteStatement, MergeParameters(pkSelectCommand.Parameters));
}

21 November 2008

Tools - Part 6 - Add-ins - “Model First” in Linq-to-SQL (and Entity Framework)

Filed under: Tools — Tags: , , , , , , — Kristofer @ 15:47

A question frequently asked on community sites such as stackoverflow.com and in the MSDN forums is “how do I generate DDL scripts with model changes“. Translated: some people want to use a model first approach with the Linq-to-SQL designer or the Entity Framework designer as the modelling tool.

This is a nice idea that I absolutely subscribe too. However, a modelling tool should be able to accurately describe the entire (as in all aspects of) the model so using either of those designers to model a database would result in schemas with no indexes besides primary keys etc. Support for incremental changes is also a pretty basic requirement in my opinion.

“Model First” in Entity Framework

A while back, the EF team published an article in their EFDesign blog where they described the approach they are taking in the next version of the EF designer, slated for release sometime around 2010 if I have interpreted everything correctly. They will add a new feature that generates the SSDL description of the storage layer automatically from the conceptual layer and then generate SQL DDL from that. Effectively they are throwing away the advantages that a layered model brings, and turning it into a 1:1 model no different from L2S models. Or in the words of Microsoft PM Noam Ben Ami: “We would like users to understand that this feature will regenerate all SSDL and all MSL from scratch.

To add insult to injury, they won’t even support generating diff-scripts, e.g. if you add “EmailAddress” to your employee table they will re-generate the entire database rather than just issue a simple “alter table employee add email_address nvarchar(100);” statement. Again in the words of Microsoft’s “Noam Ben Ami” in the efdesign blog article: “your database will be recreated from scratch“.

This “regenerate the database” feature was demonstrated by another Microsoft PM, Tim Mallalieu, at PDC during his “Entity Framework Futures” presentation. (41:00 until 47:00). He also explicitly pointed out that the database will be dropped and re-created instead of amended with the changes (45:55 - 46:15 into the presentation) and added a gleeful “I told you so, when your data is gone“. Watch it here: http://channel9.msdn.com/pdc2008/TL20/

Not to mention indexes, alias types, views, etc etc. Belongs in the model in my opinion, does not according to Microsoft’s EF team / DP group.

“Model First” in Linq to SQL

Linq-to-SQL is not really designed for a model-first approach either. Unlike EFv1 however it has the ability to generate a database from scratch. Still no support for indexes etc in the model though, but it is nice to get a baseline database (tables, PKs, FKs etc) generated.

A lot of development time involves the usual changes to models and databases. Add a field here, drop a field there. Add a couple of new tables, a couple of new foreign keys etc. Who want to throw away the database and start with a new blank database every time some small change is made to the model? Not me. So I normally model my databases using modelling tools (currently Visio for Enterprise Architects 2003) that support forward/reverse engineering and change scripts, and then update the L2S model to correspond with the db schema. Supporting that scenario is the core reason why I wrote the “sync model” feature in Huagati DBML/EDMX Tools in the first place; I was bored with having to apply the same changes to more than one model.

Now I want to simplify that by generating basic change scripts; added columns, added tables, dropped tables, dropped/added foreign keys, etc as SQL DDL scripts. So I added that feature to the add-in. It is available in version 1.47 that was just released and is available for download now.

Demo

Allow me a quick demonstration using the Microsoft Northwind database:

First I open up my existing Northwind DBML test project that contain the baseline Northwind model. I add a new member/column EmployeeEmail to the Employee entity/table, two new entities/tables “EmployeeFamily” and “EmployeeRelativeKind” and associations between Employee/EmployeeFamily and EmployeeFamily/EmployeeRelativeKind.

NorthWind with a couple of new entities/tables, associations/FKs, and members/columns

Next, I pop over to the DBML/EDMX Tools menu in Visual Studio and select the “Generate DDL with Model <-> db differences” option.

Huagati DBML/EDMX Tools menu in Visual Studio 2008

And when the add-in has completed comparing the model to the underlying database it pops up a new SQL-DDL change script outlining the differences between the model and the database.

DDL diff script generated by Huagati DBML Tools

 

So, there you have it. This is available today for Linq-to-SQL. It generates change scripts with just the differences instead of “blowing away the database” (to paraphrase Tim Mallalieu). Huagati DBML Tools now support both forward and reverse updating between model and database.

I may even add the same feature for Entity Framework if there is enough demand for it so people can reverse/forward update their models without blowing away the existing SSDL/MSL and without dropping the entire database.

 

Ps. Dear Microsoft: if you’re interested in providing these features for L2S and/or EF to your customers so they don’t have to buy the add-in from me - you got my phone number. :)

5 November 2008

Architecture - Linq-to-SQL and set-based operations: Update statements

Filed under: Architecture — Tags: , , , , , , — Kristofer @ 15:59

Data manipulation with OR mappers usually ends up with records being to be pulled back from the database to the business logic tier or client, changes applied and record-based update statements generated by the OR mapper and sent back to the db server. This is fine when working with a small number of records or when the data is needed in a higher tier for the update to take place (e.g. a user editing a customer record).

However, there are situations where this is a less-than-ideal way of working with the data. When a large number of records need to be updated, e.g. all products in a category discontinued, seat “2A-2C” blocked on all future flights etc the data is usually not needed in the application layers. Instead, set based operations done directly in the database are the most efficient way to do these operations. Many ORMs, including Linq-to-SQL and Entity Framework do not support set-based operations so set-based operations instead result in the above mentioned record-based operations with ensuing unnecessary network traffic, extra work for the database server etc.

When using Linq-to-SQL out of the box there are two ways to achieve set-based operations where an update, delete or “insert into … select … from …” statement will be executed on the database without actually returning any data:

  • Stored procedures; the set based update/delete/insert is added to a stored procedure, the stored procedure is in turn added to the model and called as a method on the data context.
  • SQL-in-strings; the good old method of concatenating a SQL statement in a string and running it using ExecuteCommand.

Both methods work fine, but the drawback is that the language integration and LINQ flexibility is lost when using either of the methods.

Wouldn’t it be neat with an extension allowing set-based updates, deletes etc based on the L2S-generated entity classes?

I’m thinking something along the lines of this. I’m going to use Northwind as the example here instead of the res system db just to make the examples more simple to follow for everyone familiar with Northwind. Say we want to discontinue all products in a product category:

using (NorthwindDBML.DataClasses1DataContext dc = new DataClasses1DataContext())
{
    dc.Update<Product>(
        (where => where.Category.CategoryID == 1),
        (set => set.Discontinued == true)
        );
}

The first lambda expression passed to the Update method is the where clause for the update statement, and the second lambda defines the updates that will take place. The final result should be a SQL update statement along the lines of:

update dbo.Products  set [Discontinued] = 1 where ProductID in (SELECT [t0].[ProductID]
FROM [dbo].[Products] AS [t0]
LEFT OUTER JOIN [dbo].[Categories] AS [t1] ON [t1].[CategoryID] = [t0].[CategoryID]
WHERE [t1].[CategoryID] = @p0)

Another example, say we want to add the IDD prefix “+49″ in front of the phone number for all German customers if they don’t already have the it. That would be expressed as:

dc.Update<Customer>(
    (where => where.Country == "Germany" && !where.Phone.StartsWith("+49")),
    (set => set.Phone == "+49" + set.Phone),
    LockModeEnum.Row);

…and would result in a SQL update statement like this:

update dbo.Customers with (rowlock) set [Phone] = (@p0 + [Phone]) where CustomerID in (SELECT [t0].[CustomerID]
FROM [dbo].[Customers] AS [t0]
WHERE ([t0].[Country] = @p1) AND (NOT ([t0].[Phone] LIKE @p2)))

To make it play nice, it takes a parameter for locking hints too. Isn’t that neat?

As already established, this is not supported in Linq-to-SQL out of the box. (and not supported in entity framework at all). So do we have to wait for Microsoft to maybe add that in a future version? No, won’t work. They already said that entity framework is their “preferred” data access technology despite L2S being more popular among their user base than EF so I don’t think they will spend time and money on implementing something like this.

Instead, I played around a bit and came up with the following. This is just a first version so there are probably bugs in it. In addition, if you have tables using reserved keywords such as “from”, “where” etc as the table-name or in column names then it will break. It may also break when used with future versions of L2S (if there will be any). That said, here’s the first implementation of set-based language integrated update statements using Linq-to-SQL data contexts and Linq-to-SQL generated classes.  (delete, “insert into … select … from …” will come in a future version).

Enjoy:

//Linq-to-SQL set-based operations. Copyright (c) 2008, Huagati Systems Co.,Ltd. ( www.huagati.com )
//DISCLAIMER: Use at your own risk, this is a code sample provided as-is with no warranties. By using the code below you take responsibility for any damage that may occur as a result.
//You are not allowed re-publish this code sample (including but not limited to articles, blogs, source code libraries etc) without prior written consent.
//You may however include it in in your own software provided that this header remains intact in the source code,
//  and that the main purpose or usage area of that software is not to add functionality to Linq-to-SQL.
//You may not modify it for usage with ADO.NET Entity Framework without prior written permission. If you do, your license to use it is automatically revoked.
using System;
using System.Linq;
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Linq.Expressions;
using System.Collections.Generic;
using System.Reflection;
using System.Reflection.Emit;
using System.Text.RegularExpressions;
namespace System.Data.Linq.SetBased
{
    public enum LockModeEnum
    {
        NotSpecified,
        Row,
        Page,
        Table,
        TableExclusive,
        NoWait
    }
    public static class LinqSetbasedExtensions
    {
        public static void Update<T>(this DataContext dc, Expression<Func<T, bool>> selectionCriteria, Expression<Func<T, bool>> updateOperation)
        {
            Update<T>(dc, selectionCriteria, updateOperation, LockModeEnum.NotSpecified);
        }
        public static void Update<T>(this DataContext dc, Expression<Func<T, bool>> selectionCriteria, Expression<Func<T, bool>> updateOperation, LockModeEnum lockMode)
        {
            IQueryable<T> baseQuery = ((IQueryable<T>)dc.GetTable(typeof(T))).Where(selectionCriteria);
            Update<T>(dc, baseQuery, updateOperation, lockMode);
        }
        public static void Update<T>(this DataContext dc, IQueryable<T> baseQuery, Expression<Func<T, bool>> updateOperation, LockModeEnum lockMode)
        {
            //get the type representing the table we're dealing with
            Type baseType = typeof(T);
            //get primary key members for the type we're dealing with
            List<MetaDataMember> primaryKeyMembers = GetPKMembers(dc, baseType);
            if (primaryKeyMembers.Count == 0)
            {
                throw new InvalidOperationException(baseType.Name + " has no primary key members.");
            }
            //create a new class containing only the PK members of the table we're going to update
            MemberBinding[] memberBindings = null;
            ParameterExpression[] baseTypeParams = new ParameterExpression[] { Expression.Parameter(typeof(T), "") };
            Type pkType = CreatePKClass(typeof(T), primaryKeyMembers, baseTypeParams[0], out memberBindings);
            //change the query to only get the PK members
            IQueryable pkQuery = GetPKQuery(baseQuery, pkType, baseTypeParams, memberBindings);
            //extract the select query for the primary key members
            System.Data.Common.DbCommand pkSelectCommand = dc.GetCommand(pkQuery);
            string whereClause = null;
            if (primaryKeyMembers.Count == 1)
            {
                whereClause = "where " + primaryKeyMembers[0].Name + " in (" + pkSelectCommand.CommandText + ")";
            }
            else
            {
                string ct = pkSelectCommand.CommandText;
                whereClause = ct.Substring(ct.IndexOf("\r\nfrom ", StringComparison.InvariantCultureIgnoreCase));
            }
            //generate a select query with the update expression as the where clause
            IQueryable<T> updateOPQuery = ((IQueryable<T>)dc.GetTable(typeof(T)).AsQueryable());
            updateOPQuery = updateOPQuery.Where(updateOperation);
            //get the table name
            string qualifiedTableName = dc.Mapping.GetTable(typeof(T)).TableName;
            //extract the where clause from the update operation query
            System.Data.Common.DbParameterCollection setClauseParameters;
            string updateClause = GetSetClauseFromWhere(dc.GetCommand(updateOPQuery), qualifiedTableName, out setClauseParameters);
            //concatenate the update statement
            string updateStatement = ReplaceSQLParameters(
                "update " + qualifiedTableName + " " + GetLockHint(lockMode)
                + " set " + updateClause
                + " " + whereClause);
            //merge parameter groups into a single array
            object[] parameters = MergeParameters(setClauseParameters, pkSelectCommand.Parameters);
            //excute the update
            dc.ExecuteCommand(updateStatement, parameters);
        }
        private static List<MetaDataMember> GetPKMembers(DataContext dc, Type baseType)
        {
            //get primary key members for a L2S class/type
            return
                (
                    from pkMember in dc.Mapping.GetTable(baseType).RowType.DataMembers
                    where pkMember.IsPrimaryKey == true
                    select pkMember
                ).ToList<MetaDataMember>();
        }
        private static IQueryable GetPKQuery(IQueryable baseQuery, Type pkType, ParameterExpression[] baseTypeParams, MemberBinding[] memberBindings)
        {
            //generate a lambda for creating a new instance of the PK type, mapped to the base type for the table
            LambdaExpression lambda
                = Expression.Lambda(
                    Expression.MemberInit(
                        Expression.New(pkType),
                        memberBindings),
                    baseTypeParams
                  );
            //create a new query returning only the PK members as instances of the new pk type
            return
                baseQuery.Provider.CreateQuery(
                    Expression.Call(typeof(Queryable),
                            "Select",
                            new Type[] { baseQuery.ElementType, lambda.Body.Type },
                            baseQuery.Expression,
                            Expression.Quote(lambda)
                    )
                );
        }
        private static Type CreatePKClass(Type baseType, List<MetaDataMember> primaryKeyMembers, ParameterExpression baseTypeParam, out MemberBinding[] memberBindings)
        {
            //generate a new assembly and class
            AssemblyBuilder assembly
                = AppDomain.CurrentDomain.DefineDynamicAssembly(
                    new AssemblyName("PKClasses"), AssemblyBuilderAccess.Run);
            ModuleBuilder moduleBuilder = assembly.DefineDynamicModule("PKModule");
            TypeBuilder typeBuilder = moduleBuilder.DefineType("PKClass_" + baseType.Name, TypeAttributes.Class | TypeAttributes.Public);
            //add the primary key members to the new type
            foreach (MetaDataMember pkMember in primaryKeyMembers)
            {
                string memberName = pkMember.Name;
                Type memberType = pkMember.Type;
                //create a private field for storage
                FieldBuilder storageField = typeBuilder.DefineField("_" + memberName, memberType, FieldAttributes.Private);
Â
                //add a get method, returning the storage field
                MethodBuilder propertyGet
                    = typeBuilder.DefineMethod("get_" + memberName,
                        MethodAttributes.Public | MethodAttributes.SpecialName | MethodAttributes.HideBySig,
                        memberType, Type.EmptyTypes);
                ILGenerator getIL = propertyGet.GetILGenerator();
                getIL.Emit(OpCodes.Ldarg_0);
                getIL.Emit(OpCodes.Ldfld, storageField);
                getIL.Emit(OpCodes.Ret);
Â
                //add a set method, setting the storage field
                MethodBuilder propertySet
                    = typeBuilder.DefineMethod("set_" + memberName,
                        MethodAttributes.Public | MethodAttributes.SpecialName | MethodAttributes.HideBySig,
                        null, new Type[] { memberType });
                ILGenerator setIL = propertySet.GetILGenerator();
                setIL.Emit(OpCodes.Ldarg_0);
                setIL.Emit(OpCodes.Ldarg_1);
                setIL.Emit(OpCodes.Stfld, storageField);
                setIL.Emit(OpCodes.Ret);
                //create a public property for the member
                PropertyBuilder memberProperty = typeBuilder.DefineProperty(memberName, System.Reflection.PropertyAttributes.HasDefault, memberType, null);
                //assign the get/set methods to the property
                memberProperty.SetGetMethod(propertyGet);
                memberProperty.SetSetMethod(propertySet);
            }
            //return the type of the generated class
            Type pkClass = typeBuilder.CreateType();
            //get member bindings mapping the members of the pk class to the corresponding members in the base class
            //(out param)
            memberBindings =
                (
                from pkMember in primaryKeyMembers
                select (MemberBinding)Expression.Bind(pkClass.GetProperty(pkMember.Name),
                            Expression.Property(baseTypeParam, baseType.GetProperty(pkMember.Name)))
                ).ToArray<MemberBinding>();
            //return the new class' type
            return pkClass;
        }
        private static string GetLockHint(LockModeEnum lockMode)
        {
            //return a lock hint based on the specified lock mode
            switch (lockMode)
            {
                case LockModeEnum.NotSpecified:
                    return "";
                case LockModeEnum.Row:
                    return "with (rowlock)";
                case LockModeEnum.Table:
                    return "with (tablock)";
                case LockModeEnum.TableExclusive:
                    return "with (tablockx)";
                case LockModeEnum.Page:
                    return "with (paglock)";
                case LockModeEnum.NoWait:
                    return "with (nowait)";
                default:
                    return "";
            }
        }
        private static string GetSetClauseFromWhere(System.Data.Common.DbCommand updateOperationCommand, string qualifiedTableName, out System.Data.Common.DbParameterCollection commandParameters)
        {
            //get just the criteria part of the query
            string commandText = updateOperationCommand.CommandText;
            commandText = commandText.Substring(commandText.IndexOf(qualifiedTableName) + qualifiedTableName.Length);
            commandText = commandText.Substring(commandText.IndexOf("where", StringComparison.InvariantCultureIgnoreCase) + 5);
            //new stringbuilder for generating the set clause
            System.Text.StringBuilder ctBuilder = new System.Text.StringBuilder();
            //replace 'and' and 'or' with comma separators
            string[] separators = { " and ", " or ", " AND ", " OR "};
            foreach (string statementPart in commandText.Split(separators, StringSplitOptions.None))
            {
                //remove unnecessary parenthesis wrappers
                string sp = statementPart.Trim();
                if (sp.StartsWith("(") && sp.EndsWith(")"))
                {
                    sp = sp.Substring(1, sp.Length - 2);
                }
                //remove "[t0]." table alias
                sp = sp.Replace("[t0].", "");
                //add to the string builder
                ctBuilder.Append(sp);
                ctBuilder.AppendLine(", ");
            }
            ctBuilder.Remove(ctBuilder.Length - 4, 4);
            //return the command parameters in the out param variable
            commandParameters = updateOperationCommand.Parameters;
            //return the where-clause turned into a set clause
            return ctBuilder.ToString();
        }
        private static string ReplaceSQLParameters(string sqlCommand)
        {
            //replace @0, @1, etc with {0}, {1} etc...
            Regex paramMatching = new Regex(@"@p\d", RegexOptions.Compiled);
            int paramNo = 0;
            while (paramMatching.IsMatch(sqlCommand))
            {
                sqlCommand = paramMatching.Replace(sqlCommand, "{" + paramNo.ToString() + "}", 1);
                paramNo++;
            }
            return sqlCommand;
        }
        private static object[] MergeParameters(params System.Data.Common.DbParameterCollection[] parameterGroups)
        {
            //merge parameter groups into a one-dimensional object array. (nested loops as the dbparametercollection don't implement IEnumerable)
            List<object> parameters = new List<object>();
            foreach (System.Data.Common.DbParameterCollection parameterGroup in parameterGroups)
            {
                foreach (System.Data.Common.DbParameter parameter in parameterGroup)
                {
                    parameters.Add(parameter.Value);
                }
            }
            return parameters.ToArray();
        }
    }
}

3 November 2008

Quiet blog, gone fishing, is Linq-to-SQL dead?, etc

Filed under: off-topic — Tags: , , , , , , — Kristofer @ 1:48

It has been a while since I posted anything on the blog. Did I stop blogging? Nope, just been having fun with other stuff. Like optimizing indexes for an oracle database, adding new features to my L2S/EF toolkit etc.

Gone Fishing

Right now I am “busy” fishing. It is Sunday and we’re at our vacation house on the countryside 130km to the west of Bangkok, where we have a small lake full of fish. I have been trying to catch a fish since yesterday but so far it is 2-0 to the fish. Over night, two tackles and hooks just disappeared. One large triple pointed hook attached to a steel wire and baited with a chicken skin is gone, the steel wire had snapped or been cut off. So I either got a very large fish with razor sharp teeth hanging out in the lake, a fish with a wire cutter, or maybe our resident monitor lizard got it.

The second missing tackle is also a triple pointed hook attached to half litre water bottle. I use water bottles as floats as they allow the fish to swim around while making it easy for me to find them. Until now. I have checked everywhere, and it is just gone. Not in the lake, and not anywhere in the irrigation canals out in the back of the garden. If the monitor lizard got it I would have expected to find it outside of her nest but no trace there either. (I know it’s a ’she’ because she laid eggs a few months ago. Unfortunately someone poached the nest a couple of weeks later so we won’t get any cute baby monitors. Monitor lizard eggs are turned into “performance enhancing” omelet by the locals.)

Some time whatever animal got it will have to surface, no fish can stay under water forever with an extra half kilo bouyancy. One remote possibility is that I got a fish on it and the fish got eaten by something larger. Like a python. They’re not resident (they’re more like consultants :) ) so if a python got it, it has probably already left our garden. If anyone comes across a large python dragging around a Minere bottle attached to a string, please return it to me. :)

 

EF performance

Enough off-topic stuff. I promised a while ago that I would write something about Microsoft’s new ADO.NET Entity Framework and performance. I was planning to do that, but I feel it is no longer necessary. There has been a lot written about it already and it isn’t pretty. Somehow they forgot to test EF and L2E beyond “Hello World”. Complex LINQ-to-entities queries are translated into really poor SQL. Less complex queries like a simple “from person p in oc.Persons where p.LastName.StartsWith(lastName) select p” are translated into equally bad SQL that will always force a table scan. Basic Math library operations such as “Round” in L2E queries are not supported at all by EF. The list goes on and on, but in short it EF v1 was simply released way too early, the current version should have been labelled “CTP” instead of “RTM”.

In contrast, L2S generates remarkably good and efficient SQL queries from just about any Linq expression thrown at it.

The future of Linq-to-SQL and Entity Framework/Linq-to-Entities

Microsoft made an announcement regarding Linq-to-SQL and Linq-to-Entities last week. Actually they made a lot of announcements regarding Linq-to-Entities last week during PDC. During one session the EF program manager, Tim Mallileau showed what they’re working on for future versions of Entity Framework. A lot of the “new” stuff in EF vNext are features that already exist and work in Linq-to-SQL such as deferred loading, POCO support with object-relational mappings described as attributes instead of in a bloated embedded EDMX file. That’s all nice, and shows that they have learnt at least a few things from the EFv1 fiasco, so now they’re trying to merge what is already there in L2S into EF. Wouldn’t it be smarter to cut the losses on EF and instead continue forward with Linq-to-SQL? At a minimum, opening up the provider model in L2S would be a nice first step…

So instead, they came out with an announcement that a large part of the L2S user base interpreted as a death certificate for Linq-to-SQL. Read it here: http://blogs.msdn.com/adonet/archive/2008/10/29/update-on-linq-to-sql-and-linq-to-entities-roadmap.aspx

A lot of heated discussions and upset blog entries ensued, people asked “Has Microsoft really killed Linq-to-SQL?” on stackoverflow etc. As always, Roger Jennings does a great job at summarizing what is being written around the blogosphere so head over to http://oakleafblog.blogspot.com/ for a summary…

Is Linq-to-SQL dead?

So is Linq-to-SQL dead? No, Linq-to-SQL is not dead. Heck no. (to paraphrase some ’softies). Microsoft can’t kill it. Linq-to-SQL is built into the .net framework as System.Data.Linq etc and it will stay there. To my knowledge, so far they haven’t ever removed/killed any features from the .net framework, but some namespaces don’t get a lot of new stuff added to them. That doesn’t mean they’re dead.

What I think the statement from Microsoft can be interpreted as is: “We’re not going to commit resources to improving Linq-to-SQL because EF is our big strategic object-relational mapping tool.“. And that is not really a big problem. Linq-to-SQL works really good out of the box, from a developer’s perspective it is easy to get started with, easy to use and really speeds up development of database-driven apps. Sure there are areas that can be improved, but the majority of those things can be solved without Microsoft having to make changes in the framework. More importantly, Linq-to-SQL is ready to use in app development today while EF is still really just a CTP release although Microsoft keeps touting it as RTM.

New features for the designer is one good example where L2S can be improved without the help of Microsoft, and I think I have already demonstrated that it can be done as VS addins. Code generation extensibility has been covered by Damien Guard in his T4 templates for L2S. There are plenty of runtime enhancements and code samples out there as well. Linq-to-SQL fans just need to follow Roger Jenning’s blog and they’ll find everything there is to find about Linq-to-SQL. So as long as Microsoft takes care of the occassional bug that might surface in L2S that’s fine and I’m sure they will. Although the documentation don’t mention it, they seem to have included a bunch of improvements in .net framework 3.5 - for example queries with multiple aggregates now come out clean and much more optimized than they did in the RTM version.

Oslo - the city with the most expensive beer in the world

Another interesting thing that was announced and demonstrated at MS PDC last week is something called “Oslo”. Besides being a city with the most expensive beer in the world, “Oslo” is also the code name for a new set of modelling tools from Microsoft, slated for release in 2010. I first heard about it a couple of months ago when I got an email from Don Box asking me if I ever considered working for Microsoft.

I have considered that a couple of times, the first time in 1997 or 1998 when I was interviewed for a job at Microsoft in Stockholm. That time I went for five separate in-person interviews (I lived 500km from Stockholm at the time), one which included filling out a “personality test” that bore a resemblance to the OCA test. They (MSFT) still couldn’t make up their mind after the fifth interview so I cut it short.

But since I got an email from a famous guy I did consider working for MSFT once again and so I attended a couple of phone interviews. This time there was no personality test involved so either they got rid of that test or maybe it was just something used by Microsoft in Stockholm. This time I got to the third interview, the tech interview. I botched it on a couple of questions on swapping items in linked lists, and on QSort. Don’t know what I can blame that on but I guess it being 7am and me not being a morning person is one semi-valid excuse. Anyway, they were cool about it and said it had to do with budget constraints and not me messing up the linked list… :) Budget constraints at Microsoft? That’s like snow in Bangkok…

So no, I’m not going to become a ’softie. Anyway, the whole experience got me interested in knowing more about “Oslo” (mainly because Don Box is leading the Oslo team, making it highly likely that something cool has to come out of it), so I downloaded the Oslo SDK CTP when it was released last week.

Oslo includes a graphical modelling tool, Quadrant (which is not included in the CTP download), a text-based modelling language called M, some lower level languages for defining domain specific languages, a repository design pattern that appear to be aimed at ensuring the generated databases will play nice in their new “Cloud” services platform, called “Windows Azure” (with a worse nickname). The “Oslo” tools are still in their infancy so there’s a lot of stuff missing, but it is still interesting to play around with it.

If you are interested/active in model driven development, drift over to the msdn Oslo Developer Center,  download the SDK, take it for a test spin, browse the Oslo msdn forum and let them know what features you would like to see in Oslo.

Personally, I am hoping that this time around MSFT will release a modelling tool that [on the database modelling side]:

  • is at least as easy to use as Visio, and as flexible as ERStudio and ERWin together
  • supports all basic/common data modelling scenarios
  • will successfully be able to reverse-and-forward engineer at a minimum Northwind and AdventureWorks without losing any database objects
  • will handle versioning of data models and schemas; schemas evolve over time in any system so the ability to update an existing database with model changes is a fairly basic requirement in my opinion
  • will be flexible enough to support the different naming conventions commonly in use

Now I’m off to check if I got a fish or if the missing bottle/tackle has surfaced anywhere. If not, I’m going to switch to net-fishing next week.

2 September 2008

Tools - Part 5 - Add-ins - Documentation features in Entity Framework vs Linq-to-SQL

Filed under: Tools — Tags: , , , , — Kristofer @ 14:16

EF Designer - Documentation Features

A couple of weeks ago I mentioned that the ADO.NET Entity Framework designer has a new documentation field that allows entity types and members (classes and properties) to be documented right in the designer.

Documentation fields in the EF designer's property dialog

Documentation fields in the EF designer's property dialog

I think this is a useful feature - once descriptions have been typed in for all classes and properties, the code editor will show the descriptions in the tooltip for the relevant classes and properties.

Tooltip on an entity type in the Visual Studio code editor

It has a cosmetic flaw: In the haste to release VS2008 and .net 3.5 SP1 it seems like MSFT forgot to apply the entity type documentation to the entity set accessor properties, so entity sets (and all undocumented classes and properties) will show the default “There are no comments for [name] in the schema.“. I would like to see the EntitySet using the same documentation as the EntityType, or at least have a “Documentation” property of its own. (It does in the EDMX, but not in the designer)

EntitySet tooltip

The “There are no comments” default value is a bit redundant and waste of screen real-estate. Why not leave the default value blank? Oh well, this is just version one so I guess that will be improved in some future service pack update.

Also missing is the ability to automatically populate the documentation field with descriptions from data model diagrams and/or the database. The descriptions may already exist in a data model diagram or in the database. In addition, knowing what indexes exist is very useful when writing queries, and what indexes exist on tables etc are of course available from the database so being able to automatically retrieve those two pieces of information from the db would be a nice-to-have feature.

Linq-to-SQL documentation features

Being a fan of the more stable and faster OR mapper Linq-to-SQL, I would like to be able to have the same kind of inline/tooltip documentation for my L2S classes and members. However, the L2S designer does not yet have the “Documentation” field that the EF designer do. I wonder why…

Maybe MSFT will add the documentation fields to some future version of the L2S designer, and maybe they will also add coupling to build documentation from the database. I’m not going to hold my breath while waiting for this so instead I decided to add this as a new feature in the Huagati DBML Tools add-in for Visual Studio.

In brief, it works like this:

The add-in has a new menu option called “Update Linq-to-SQL documentation from database”:

New menu option for updating L2S documentation from the database

New menu option for updating L2S documentation from the database

When the new documentation feature is used, the add-in will retrieve table and column descriptions from the database, along with index definitions for all tables.

SQL Server Management Studio

The add-in updates the L2S generated code with summary xml comments and description attributes for all entities/classes, members/properties and data context entity accessor properties. I would have preferred to keep the documentation in a separate file, but unfortunately I don’t see a way to document member properties from separate partial classes so for now it is done inside the generated file. Not a big problem since the documentation can be re-generated at any time by just selecting the “update” menu option.

The end-result is that I can now see both descriptions and index information in-line while writing code against the L2S generated classes.

Voilà! Finally some nice descriptions and db index information right in the code editor.

Linq-to-SQL datacontext's table accessor property with tooltip

Linq-to-SQL generated entity type with tooltip

Tooltip for member property in Linq-to-SQL generated class 

Availability

The Linq-to-SQL documentation feature is available in version 1.20 of the Huagati DBML Tools add-in that can be downloaded from http://www.huagati.com/dbmltools/ . Users of previous versions can upgrade for free by downloading the latest version and installing it over the existing version. New users can download the add-in and get a trial license or a full license from the same page.

5 August 2008

Tools - Part 3 - Add-ins - Writing optimized queries

Filed under: Tools — Tags: , , , — Kristofer @ 17:23

Every now and then I am commissioned to do database performance investigations for clients. This usually happens when they have a production system that has finally reached the point where users get frustrated, or - even worse - the system stops performing vital tasks because it is overloaded.

Configuration

There are a number of things that can cause database performance problems but there are a few causes I encounter much more frequently than others. Hardware configuration, OS configuration and database configuration are the first things I take a look at together with perfmon data to get an overview over where there may be bottlenecks. On Windows/SQL Server systems running large databases the most commonly overlooked or omitted configuration issues I encounter are:

  • Incorrect memory configuration - systems with >3Gb RAM without using the appropriate combination of /3Gb, /PAE, and /UserVA boot flags. SQL Server can make really good use of every little chunk of memory it is allowed to access so on some systems just setting these options so they match the environment have provided a “night and day difference”.
  • Less than optimal utilization of available disk bandwidth. Disk bandwidth is the most scarce resource on any database server so utilizing the available disk bandwidth by using the correct raid configuration, placing data files, transaction logs, tempdb on the right spindles etc can make a huge difference.
  • Non-vital services running on the database server; antivirus, secondary [test] instances of sql server, unused web servers, indexing services and other ‘resource thieves’.

All these are issues that should have been taken care of by a DBA or sys admin [if there is a dba or sys admin]. Either there hasn’t been a DBA involved in setting up systems plagued by these issues, or the DBA/sysadmin does not have the knowledge or resources needed to perform his/her duties, or the configuration has changed since the DBA was originally involved. Whatever the cause, these simple configuration issues are things that I have stumbled over on many production systems and a couple of changes and reboots later the systems are a lot snappier. I’m not going to delve into the details of these to not go too far off-topic, and besides there are so many articles and blogs entries covering these aspects already so there is no need to repeat it.   [However, if you need assistance with reviewing configuration on an existing system, feel free to contact me for a quotation :) ]

Bad queries

Once I have looked at the basic configuration settings [along with perfmon and profiler data to determine the optimum configuration for a specific system] I normally take a closer look at the user activity on the system, i.e. queries, connections etc from the various client- and reporting applications typically involved. This is the area where I frequently find the worst offenders. Even a single report query, innocently added to address some business need, have been proved over and over to bring otherwise optimally performing databases on its’ knees (or even below the floorboards).

In a perfect world, which applications should have access to production databases and all queries allowed into a production system are strictly controlled and everything new is thoroughly reviewed before being allowed into a production environment. In the real world, business requirements, time- and resource constraints tend to make individuals and organizations break these rules. So in the real world, where do the queries that run against production databases originate from? A few common sources are:

  • Code written by programmers that understand how the database works, and how a query will be executed usually take care to write good, optimal queries that have been thoroughly tested and reviewed to ensure that criteria and groupings use the right indexes, that joins will be picked up in the right order by the optimizer, that the necessary indexes exist in the first place, that the overall execution plan is optimal etc. These guys rarely cause a problem, and when they do it is usually because of time constraints that cause them to write sloppy code.
  • Code written by programmers that don’t understand how the database works, and don’t have the interest, time, or ability to gain that knowledge/understanding. Some of the people that fall in this category know their own limitations and get their work reviewed, others don’t and can - with a few lines of code - bring disaster into a system that is otherwise in harmony with its’ enviroments. Some organizations have mechanisms in place to catch the work of this group - code reviews, test cycles with dba reviews etc while others don’t.
  • Business intelligence / custom reporting. I’m not disputing the importance of these tools to address business needs, but the easier these tools become to use for laymen, the more dangerous they [potentially] become to the systems they interact with. Again, in the ‘perfect world’ scenario, all BI and custom reporting are kept on separate systems - whether log-shipped or replicated copies of the production database or separate data warehouse databases with pre-aggregated data -  well away from the transactional production systems. In the ‘real world’ they sometimes interact directly with the production system database(s) where a seemingly innocent query dragged-and-dropped together in a query builder tool can bring a system on its knees.
  • Add-on web interfaces; web pages or web services - not part of the core system - that provide customers, suppliers or other external parties access to certain kinds of information from otherwise internal systems. These tend to be written by in-house IT department developers to add functionality to systems provided by ISVs, and if the people writing them either fall in category #2 or don’t have enough knowledge about the system they are interfacing with these seemingly innocent on-the-sidelines mini apps can in the real world be the root cause of serious performance problems.

Tools

If you have made it this far, you are probably asking yourself: why this rant in an article in the tools category?

The simple answer: addressing the problem of poorly written queries early on in a development cycle makes it easier to avoid them altogether. And the best way to avoid them is by using the right kind of tools.

Linq to SQL out of the box gives us type safety and intellisense when writing queries, which vastly improves productivity when writing code that interacts with the database. It makes the queries known to the compiler already at compile time in the vast majority of cases. [dynamic queries excluded].

In the past, bad queries had to be caught after they were written; in code reviews, dba reviews, using profiling tools such as SQL Server Profiler on test- and production systems etc. With Linq to SQL this is even more true when using the out-of-the box tools. In fact, the main argument used by many Linq to SQL detractors is that it takes control over query composition and “hides” the queries away from programmers and DBAs. This is true when using the out-of-the-box [first] version of Linq2SQL (and EF), but that doesn’t mean it have to be that way, and it does not necessarily mean that taking SQL composition away from the programmers or DBAs have to be a negative thing.

So, to address this problem I am working on a couple of additional tools that will plug right into Visual Studio together with the Huagati DBML Tools add-ins and add some new functionality.

Tool #3: DBML Explorer

I will start the numbering with #3 as there are already two tools included in the Huagati DBML Tools toolkit. The DBML Explorer will be the third tool to be added to the toolkit.

Today, Visual Studio gives programmers access to the database objects through the Server Explorer sidebar, and access to the ORM side of it through the class designer(s) and properties dialog. The properties dialog also shows some basic database attributes such as table name, column data types etc. However, neither of the two gives a unified view over both, and on top of that neither are particularly easy to navigate for large databases.

The DBML Explorer is intended to overcome this and bridge the gap. It combines the information from the ORM side (DBML file / Linq2SQL designer) with schema information; table details, column details, index information, foreign keys etc. With this information at hand directly in Visual Studio it will be easier to write Linq queries against the database while being aware of what indexes are available for the query, comparative size of tables etc.

Preview Screenshot of the DBML Explorer

Preview Screenshot of the DBML Explorer

Tool #4: DBML Query Builder

Although I don’t like using query builder tools myself - I actually enjoy  writing SQL (and nowadays LINQ) queries - I can see the productivity boost that a query builder tool can bring to those who are not familiar or comfortable with query language syntax. This tool extends the DBML Explorer with functionality to quickly select the relevant tables and columns, join types, groupings/aggregates, criteria, formulas and then let the tool compose the Linq query with on-screen advice on index usage etc. This tool will help beginners get up to speed with LINQ syntax as well as save typing for those who are not amused by writing queries.

There are already some Linq query builder tools out there, but what I intend to address with this one is how the final query will interact with the database. If it will end up doing a table scan on a large table [or several large tables] this should be visualized already while the query is being composed in the query builder.

Tool #5: LINQ Query Analyzer

Since Linq has already given us productivity enhancing features such as type safety and intellisense when writing queries, why not take it a step further? Since the query is no longer stuffed away in a string literal but is now part of the code, some basic things; index utilization, join order and other parts of the execution plan can be checked early on, inside Visual Studio, any time between the query is written and compiled. This means that bad queries can be caught already at design time in Visual Studio instead of later in SQL Server Profiler while troubleshooting performance issues in a production environment.

Preview Screenshot of the Linq2SQL Query Analyzer output in Visual Studio

Preview Screenshot of the Linq2SQL Query Analyzer output in Visual Studio

 

Availability of the tools

These additional tools will be included in the Huagati DBML Tools package as they become ready to use. The DBML Explorer is first up and it is scheduled to be released later this week. The other two tools don’t have a release date yet but I will update this post when the date is known.

Existing license holders

Existing license holders of the tools will be able to upgrade to a new version including the additional tools free of charge.

 

Note: This posting is provided “AS IS” with no warranties, and confers no rights.

17 July 2008

Tools - Part 2 - Add-ins - DBML synchronization add-in updated

Filed under: Tools — Tags: , , — Kristofer @ 18:45

This is just a short release notification covering the latest updates to the DBML refresh add-in.

The add-in now detects the following database changes and will apply those to the dbml:

  • Added columns
  • Dropped columns
  • Dropped tables
  • Primary key changes
  • Auto generated column changes
  • Data type changes
  • Added foreign keys
  • Dropped foreign keys [new feature added on 22 July 2008]
  • New tables [new feature added on 23 July 2008]

Still missing (but coming in a future version) are:

  • New tables [feature was added on 23 July 2008]
  • New, modified or dropped stored procedures
  • Dropped foreign keys [feature was added on 22 July 2008]

The add-in is described in more detail in the “Tools Part 1″ article and can be downloaded from http://www.huagati.com/dbmltools/

 

The current version is 1.0.3120.33307, dated 17 June 2008 18:30.

Updated on 22 July 2008: Added support for dropped foreign keys and fixed a couple of issues related to recursive FKs (referencing the same table). The current version is 1.0.3125.30798 dated 22 July 2008 17:06.

Update: 23 July 2008 - added support for new tables, fixed a pile of reported and unreported bugs and re-tested against the reservation db and Northwind. The current version is 1.0.3126.38442 timestamped 23 July 2008 21:21.

Update: 26 July 2008 - numerous fixes and adjustments. Added licensing model; the free edition now supports up to 20 tables per DBML file. For larger DBML files, a license can be acquired from http://www.huagati.com/dbmltools/

Update: 30 July 2008 - added options dialog allowing the user to select what parts to synchronize (new/dropped tables, new/dropped/modified columns, PKs, new/dropped FKs). These options can be persisted along with a table exclusion list for unwanted tables. Also added a user guide detailing how to install and use the add-ins. Updates can be downloaded from http://www.huagati.com/dbmltools/

7 July 2008

Tools - Part 1 - Add-ins - Maintaining naming conventions and keeping the Linq2SQL DBML in sync with the database

Filed under: Tools — Tags: , , — Kristofer @ 12:43

As with many new technologies, there are a couple of missing features in Linq2SQL and the DBML designer. One feature that would really help during development is the ability to re-sync the DBML with the database so any data model changes can be incorporated in the DBML and the auto-generated entity classes.

At the moment, Microsoft’s tools provides three options for refreshing/resynchronizing the DBML with the database:

  1. Remove all, or only the changed entities and re-add them to the DBML designer
  2. Same as above, but use SQLMetal to re-generate the entire dbml
  3. Manually keep the dbml / Linq2SQL designer in sync with the database

The first two methods work fine if the naming conventions in the database match .net naming conventions, or if one prefer database naming conventions in the code. However, I always use all lowercase underscore separated table and field names in the database but pascal/proper case/.net entity names and member names in .net code. This more or less rules out options one and two when using out-of-the-box VS 2008. Re-generating the Linq2SQL diagram and then renaming entities and members by hand is a lengthy and error prone exercise. On the other hand, manually applying all changes both to the data model diagram and to the dbml is also duplicated work and also risk introducing errors/mismatches.

To remedy this, I will create a couple of Visual Studio add-ins that take over where Microsoft left off.

Add-in #1: Naming convention add-in

First off is an add-in for updating all entity and member names so they match .net naming conventions, e.g. the table airport will get a class name Airport, route_via will become RouteVia, the field airport.airport_code will simply be Airport.Code and so on. The following image shows the airport table before and after being updated by this add-in:

This image shows the intended change applied on the airport table

 

DBML is a fairly straightforward xml document where all tables, fields, entity classes and members are described with their database- and code attributes. Together with the dbml a layout file is also generated; this file is also an xml file and it stores the layout of the Linq2SQL designer diagram; position of tables, association lines etc. All this add-in need to do is go through those two files, update all entity and member names, update associations and then save the files to trigger the code generator to re-generate the entity classes.

The end result is that updating a Linq2SQL designer generated dbml, or sqlmetal generated dbml to use .net naming conventions after creating or refreshing the dbml from the database is as simple as hitting a button.

Add-in #2: Linq2SQL designer refresh add-in

Once the Linq2SQL base DBML has been generated it is important to keep the DBML in sync with any schema changes. If a couple of fields have been added to existing database tables, fields removed, tables removed, primary key changes, or foreign keys added or removed we may still not want to re-generate the entire DBML although this is the only option we’re left with when using the tool supplied by Microsoft.

To overcome this, I created a second add-in that will compare the dbml with the database schema of the database pointed to by the connection string embedded in the dbml and update the dbml accordingly. If any new columns have been added to tables represented in the dbml, columns removed, columns changed nullability or data type, PK members changed, foreign keys has been added/removed, or entire tables has been added or removed the add-in will detect the schema changes and apply them to the dbml.

Support for stored procedures will be added in a future version.

Downloading and installing the add-ins

The add-ins can be downloaded along with a user guide from http://www.huagati.com/dbmltools/

There are three editions; the free edition that support up to 20 tables, the Standard edition that supports up to 80 tables and the Professional edition that supports as many tables as Linq2SQL/DBML.

Using the add-ins

Whenever you have the DBML designer open in Visual Studio, the Tools menu will get two new options from the add-in:

VS 2008 Tools Menu with add-in menus

The first one, Standardize DBML Entity and Member names, will update all entity names as described above, and the second one, Update DBML diagram from database, will connect to the database and update the DBML file so it corresponds with the database schema.

Whenever either add-in is used, all changes done by the add-in will be recorded in the output window:

 

Update: 16 July 2008 - fixed bug related to auto-generated columns and added PK support. The latest version is 1.0.3119.33326 with a timestamp 2008-07-16 18:30

Update: 17 July 2008 - added support for added foreign keys. The latest version is 1.0.3120.33307 with a timestamp 2008-07-17 18:30

Update: 22 July 2008 - added support for dropped foreign keys and fixed a couple of issues with recursive tables (FKs referencing the same table). The current version is 1.0.3125.30798 with a timestamp 2008-07-22 17:06.

Update: 23 July 2008 - added support for new tables, fixed a pile of reported and unreported bugs and re-tested against the reservation db and Northwind. The current version is 1.0.3126.38442 timestamped 23 July 2008 21:21.

Update: 26 July 2008 - numerous fixes and adjustments. Added licensing model; the free edition now supports up to 20 tables per DBML file. For larger DBML files, a license can be acquired from http://www.huagati.com/dbmltools/

Update: 30 July 2008 - added options dialog allowing the user to select what parts to synchronize (new/dropped tables, new/dropped/modified columns, PKs, new/dropped FKs). These options can be persisted along with a table exclusion list for unwanted tables. Also added a user guide detailing how to install and use the add-ins. Updates can be downloaded from http://www.huagati.com/dbmltools/

 

Update: This article is somewhat outdated. See http://www.huagati.com/dbmltools/ for an up-to-date description of the add-in, its’ features, download link etc.

18 June 2008

Application Architecture - Part 1 - Data Access Layer - LINQ and Linq2SQL

Filed under: Architecture — Tags: , , — Kristofer @ 13:38

Although the data model is just in its earliest stages I will cover some application architecture topics in parallell. First off is the data access layer. As I am basing this on Microsoft technology using Microsoft tools there are some interesting new tools that Microsoft just made available in November 2007 when they released the .net framework 3.5, Visual Studio 2008 and C# 3.0.

LINQ, Language INtegrated Query

The first interesting part of .net 3.5 is LINQ, short for Language INtegrated Query. This is a combination of new .net framework classes and new language and compiler features. LINQ allows programmers to write queries directly in C# code (or other supported .net languages). I’ll stick to C# examples here. LINQ queries can target just about anything, objects, object collections, databases, xml etc. The great benefit of using LINQ is that it adds type safety, intellisense etc to the queries.

Linq2SQL

One of the extensions to LINQ that ships with .net 3.5 is LINQ2SQL. Linq2SQL includes an ORM (Object Relational Mapping) designer for SQL Server databases, and an entity and data context code generator. Basically it allows you to drag-and-drop tables, stored procedures etc from a SQL Server database onto the ORM designer inside Visual Studio, and it will automatically generate an object model diagram along with entity classes for all tables containing all fields and foreign keys as properties of matching data types.

The generated entities and members can be renamed using naming conventions more suitable in C# directly in the designer while maintaining a reference to the underlying names in the database, and additional properties, methods etc can be added to the entity classes as they are implemented as partial classes.

The Linq2SQL ORM DBML designer surface with entity classes corresponding to a few of the res system tables

The screenshot above shows the Linq2SQL ORM designer / DBML designer with a few of the entity classes corresponding to tables in the data model.

With the simple basic part of the data model created so far, the Linq2SQL ORM designer and code generator just saved me a lot of typing; the generated entity classes are 800kb+ and I now have an object model corresponding to my data model.

Once the data access layer object model is in place, LINQ queries can be written directly against the generated entity classes. Behind the scenes, Linq2SQL generates parameterized SQL queries, handles database connections, caching if needed, execute the queries and generates back objects matching the query results.

And the best part of it: intellisense support for all entities and entity members; one no longer have to memorize every detail of the data model or stare at a data model diagram while writing code. And as the query is written in C# against strongly typed objects, the visual studio syntax checker will catch syntax errors, type mismatches and other mistakes that does not show up until at runtime when using traditional SQL queries.

The query in the screenshot above,

    

  

 

(
from oca in dc.CityAirports
join rte in dc.Routes on oca.AirportCode equals rte.OriginAirportCode
join dca in dc.CityAirports on rte.DestinationAirportCode equals dca.AirportCode
where
oca.CityCode == originCityCode
  && dca.CityCode == destinationCityCode
 
&& oca.Airport.ActiveFlag == true
 
&& dca.Airport.ActiveFlag == true
 
&& oca.City.ActiveFlag == true
 
&& dca.City.ActiveFlag == true
select rte
).ToList<DAL.
Route
>();

…is automatically turned into parameterized SQL when the query is executed. Running SQL Profiler in parallell when testing the method above shows the following automagically generated SQL:

SELECT [t1].[route_id] AS [ID], [t1].[origin_airport_code] AS [OriginAirportCode], [t1].[destination_airport_code] AS [DestinationAirportCode]
FROM [dbo].[city_airport] AS [t0]
INNER JOIN [dbo].[route] AS [t1] ON [t0].[airport_code] = [t1].[origin_airport_code]
INNER JOIN [dbo].[city_airport] AS [t2] ON [t1].[destination_airport_code] = [t2].[airport_code]
INNER JOIN [dbo].[airport] AS [t3] ON [t3].[airport_code] = [t0].[airport_code]
INNER JOIN [dbo].[airport] AS [t4] ON [t4].[airport_code] = [t2].[airport_code]
INNER JOIN [dbo].[city] AS [t5] ON [t5].[city_code] = [t0].[city_code]
INNER JOIN [dbo].[city] AS [t6] ON [t6].[city_code] = [t2].[city_code]
WHERE ([t0].[city_code] = @p0) AND ([t2].[city_code] = @p1) AND ([t3].[active_flag] = @p2) AND ([t4].[active_flag] = @p3) AND ([t5].[active_flag] = @p4) AND ([t6].[active_flag] = @p5)

LINQ2SQL runs all queries using the sp_executesql stored procedure so they execution plans are all cached in SQL Server’s stored procedure execution plan cache for later reuse; something that saves cpu cycles for complex queries.

Cons?

The drawbacks of using LINQ2SQL? This is covered in hundreds of other blogs discussing the pros and cons of LINQ2SQL, commonly accusing it of ‘taking control away from DBAs’. In my experience (and I have used it in a medium sized project with some ~70 tables as the DAL; Hedgehog is 100% Linq2SQL based and has been running in production environments for months) there are just a few minor things that can all be worked around:

  1. First of all, it does not allow SQL optimizer hints such as join method hints, index hints, locking hints etc to be included in generated queries. However, in the rare cases when optimizer hints are required, those queries can be converted to stored procedures instead and Linq2SQL also supports stored procedures.
  2. Another ‘issue’ is locking/isolation mode. It only allows isolation mode to be set for queries that are run within a transaction context, so every little ‘dirty read’ / ‘read uncommited’ / ‘with (nolock)’ query needs to be run inside a transaction. No big deal really because the code for it can be hidden in extensions to the generated data context class but it would still be nice to be able to avoid the overhead of a database transaction for such queries.
  3. Queries that extensively use groupings and aggregates result in less than optimal generated SQL, so that kind of queries are also better to move out to stored procedures for now.
  4. Serialization support is poor; one has to go through hoops and tricks and avoid certain features in order to make the generated entity classes serializable using binary or XML serialization in .net.

Hopefully Microsoft will address those issue or at least some of them in .net 3.5 SP1 which is scheduled to be released sometime later this year.

The future

dotNet 3.5 SP1 will also contain ADO.NET entity framework, the next-generation version of Linq2SQL that allows for further abstraction between the object model and the database. EF also has better support for serialization which makes life easier for us n-tier developers. However, until SP1 RTM is out (a beta is already available for download from Microsoft) I will stick to Linq2SQL and possibly convert to EF once it has been officially released.

Powered by WordPress