Designing an airline passenger reservation system

15 November 2008

Data Model - Basics - Part 11 - The HuagatiRes data model converted to “M”

Filed under: Data Model, Tools — Tags: , , — Kristofer @ 18:35

Happy over Microsoft’s new commitment to data modelling, I have been playing around a bit with the “Oslo” CTP. It is time they come up with a replacement for Visio 2003 EA. The latest thing I tried was a conversion of the HuagatiRes datamodel into an M script. There’s not much to say about it - “M” and “Oslo” are still CTPs so some things can be defined in M models (e.g. table defs, foreign keys, PKs) while others can not be described in M (e.g. indexes, alias types, precision/scale on numeric/decimal types etc). I also haven’t figured out how to declare both nullability and length on members of type “Text”.   “:Text where value.Count <=100” works fine while “:Text? where value.Count <=100” will throw an error, but that might just be me not fully understanding the M syntax.

The M model for HuagatiRes can be downloaded from here:

http://blog.huagati.com/downloads/HuagatiResMmodel.zip

Now, I wasn’t stupid enough to actually type all of that into Intellipad. I did something even more stupid; I wrote a code snippet that extracts schema, type, table, column, foreign key, and index defs and generates M files.

If you’re interested in taking the schema->M tool (”Huagati MGen” a.k.a. “Aker Brygge”) for a test-spin, you can download it from here:

http://blog.huagati.com/downloads/HuagatiMGen.zip

It requires .net 3.5 SP1 and works against SQL Server 2005 and SQL Server 2008 databases only. Give it a connect string and an output folder and it will generate one M file per schema in the database you point it at, complete with code comments (based on object names and extended property descriptions) and all. Even [commented out] index definitions - prepared for the day when M gets support for index definitions…

Huagati MGen

 

MGen Output

12 November 2008

Offtopic: Huagati DBML/EDMX Tools charity donation

Filed under: off-topic — Tags: , , , , — Kristofer @ 17:46

Do you want to get a good deal on the Huagati DBML/EDMX Tools? Right now there’s a limited quantity of “charity donation” licenses available. This is a full, unlimited, single user license that you would normally have to pay USD $119.95 for. Now you can name the price, any price you want, and receive a license by donating that amount to a charity fundraising for a poor school in northern Thailand.

The charity in question is a fundraising event that my son’s school, the British School of Bangkok (BSB), is holding in benefit of the Ban kok noi wittaya School, a poor school in the Chiang Rai mountains in northern Thailand. They will use the funds raised through the event to buy solar panels for electricity generation and other materials and equipment needed.. This is direct-to-the-needy charity event.

The main part of the fundraising is a Christmas fair that will be held later this month for families and friends of BSB students. BSB’s financial director have a PayPal donation button on this page from a previous fundraising campaign: http://www.wilburygroup.com/banpakkwangschool/index.html

Some photos and more information about the recipent school is available here: http://www.wilburygroup.com/banpakkwangschool/march2008.html

Huagati Systems is contributing by giving away licenses for Huagati DBML/EDMX Tools to the first twenty people who make a donation via PayPal. Use the donation button in the middle of this page if you’re interested in a charity donation license.

Any amount will do. If you’re just looking for a cheap license, donate £1 (=$1.50). If you can afford a larger donation, please do. The first twenty people who make a donation of any amount between now and 22 November 2008 and then forward their PayPal receipt from the donation to licensing@huagati.com will receive a full, unlimited license for Huagati DBML/EDMX Tools as a thanks for the donation. Your license key will be emailed back to you within a business day.


 

Unless activated the license is fully transferrable so you can re-sell or give away your charity donation license if you want..

This offer is valid until the limited quantity of charity licenses have been sold out, or until 22 November 2008 (whichever comes first). One charity donation license per person.

Note that the donation page accepts funds in GBP. 1GBP = 1.50 USD / 1 USD = 0.67 GBP.

Total licenses: 20
Licenses claimed (to date): 1
Remaining: 19

5 November 2008

Architecture - Linq-to-SQL and set-based operations: Update statements

Filed under: Architecture — Tags: , , , , , , — Kristofer @ 15:59

Data manipulation with OR mappers usually ends up with records being to be pulled back from the database to the business logic tier or client, changes applied and record-based update statements generated by the OR mapper and sent back to the db server. This is fine when working with a small number of records or when the data is needed in a higher tier for the update to take place (e.g. a user editing a customer record).

However, there are situations where this is a less-than-ideal way of working with the data. When a large number of records need to be updated, e.g. all products in a category discontinued, seat “2A-2C” blocked on all future flights etc the data is usually not needed in the application layers. Instead, set based operations done directly in the database are the most efficient way to do these operations. Many ORMs, including Linq-to-SQL and Entity Framework do not support set-based operations so set-based operations instead result in the above mentioned record-based operations with ensuing unnecessary network traffic, extra work for the database server etc.

When using Linq-to-SQL out of the box there are two ways to achieve set-based operations where an update, delete or “insert into … select … from …” statement will be executed on the database without actually returning any data:

  • Stored procedures; the set based update/delete/insert is added to a stored procedure, the stored procedure is in turn added to the model and called as a method on the data context.
  • SQL-in-strings; the good old method of concatenating a SQL statement in a string and running it using ExecuteCommand.

Both methods work fine, but the drawback is that the language integration and LINQ flexibility is lost when using either of the methods.

Wouldn’t it be neat with an extension allowing set-based updates, deletes etc based on the L2S-generated entity classes?

I’m thinking something along the lines of this. I’m going to use Northwind as the example here instead of the res system db just to make the examples more simple to follow for everyone familiar with Northwind. Say we want to discontinue all products in a product category:

using (NorthwindDBML.DataClasses1DataContext dc = new DataClasses1DataContext())
{
    dc.Update<Product>(
        (where => where.Category.CategoryID == 1),
        (set => set.Discontinued == true)
        );
}

The first lambda expression passed to the Update method is the where clause for the update statement, and the second lambda defines the updates that will take place. The final result should be a SQL update statement along the lines of:

update dbo.Products  set [Discontinued] = 1 where ProductID in (SELECT [t0].[ProductID]
FROM [dbo].[Products] AS [t0]
LEFT OUTER JOIN [dbo].[Categories] AS [t1] ON [t1].[CategoryID] = [t0].[CategoryID]
WHERE [t1].[CategoryID] = @p0)

Another example, say we want to add the IDD prefix “+49″ in front of the phone number for all German customers if they don’t already have the it. That would be expressed as:

dc.Update<Customer>(
    (where => where.Country == "Germany" && !where.Phone.StartsWith("+49")),
    (set => set.Phone == "+49" + set.Phone),
    LockModeEnum.Row);

…and would result in a SQL update statement like this:

update dbo.Customers with (rowlock) set [Phone] = (@p0 + [Phone]) where CustomerID in (SELECT [t0].[CustomerID]
FROM [dbo].[Customers] AS [t0]
WHERE ([t0].[Country] = @p1) AND (NOT ([t0].[Phone] LIKE @p2)))

To make it play nice, it takes a parameter for locking hints too. Isn’t that neat?

As already established, this is not supported in Linq-to-SQL out of the box. (and not supported in entity framework at all). So do we have to wait for Microsoft to maybe add that in a future version? No, won’t work. They already said that entity framework is their “preferred” data access technology despite L2S being more popular among their user base than EF so I don’t think they will spend time and money on implementing something like this.

Instead, I played around a bit and came up with the following. This is just a first version so there are probably bugs in it. In addition, if you have tables using reserved keywords such as “from”, “where” etc as the table-name or in column names then it will break. It may also break when used with future versions of L2S (if there will be any). That said, here’s the first implementation of set-based language integrated update statements using Linq-to-SQL data contexts and Linq-to-SQL generated classes.  (delete, “insert into … select … from …” will come in a future version).

Enjoy:

//Linq-to-SQL set-based operations. Copyright (c) 2008, Huagati Systems Co.,Ltd. ( www.huagati.com )
//DISCLAIMER: Use at your own risk, this is a code sample provided as-is with no warranties. By using the code below you take responsibility for any damage that may occur as a result.
//You are not allowed re-publish this code sample (including but not limited to articles, blogs, source code libraries etc) without prior written consent.
//You may however include it in in your own software provided that this header remains intact in the source code,
//  and that the main purpose or usage area of that software is not to add functionality to Linq-to-SQL.
//You may not modify it for usage with ADO.NET Entity Framework without prior written permission. If you do, your license to use it is automatically revoked.
using System;
using System.Linq;
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Linq.Expressions;
using System.Collections.Generic;
using System.Reflection;
using System.Reflection.Emit;
using System.Text.RegularExpressions;
namespace System.Data.Linq.SetBased
{
    public enum LockModeEnum
    {
        NotSpecified,
        Row,
        Page,
        Table,
        TableExclusive,
        NoWait
    }
    public static class LinqSetbasedExtensions
    {
        public static void Update<T>(this DataContext dc, Expression<Func<T, bool>> selectionCriteria, Expression<Func<T, bool>> updateOperation)
        {
            Update<T>(dc, selectionCriteria, updateOperation, LockModeEnum.NotSpecified);
        }
        public static void Update<T>(this DataContext dc, Expression<Func<T, bool>> selectionCriteria, Expression<Func<T, bool>> updateOperation, LockModeEnum lockMode)
        {
            IQueryable<T> baseQuery = ((IQueryable<T>)dc.GetTable(typeof(T))).Where(selectionCriteria);
            Update<T>(dc, baseQuery, updateOperation, lockMode);
        }
        public static void Update<T>(this DataContext dc, IQueryable<T> baseQuery, Expression<Func<T, bool>> updateOperation, LockModeEnum lockMode)
        {
            //get the type representing the table we're dealing with
            Type baseType = typeof(T);
            //get primary key members for the type we're dealing with
            List<MetaDataMember> primaryKeyMembers = GetPKMembers(dc, baseType);
            if (primaryKeyMembers.Count == 0)
            {
                throw new InvalidOperationException(baseType.Name + " has no primary key members.");
            }
            //create a new class containing only the PK members of the table we're going to update
            MemberBinding[] memberBindings = null;
            ParameterExpression[] baseTypeParams = new ParameterExpression[] { Expression.Parameter(typeof(T), "") };
            Type pkType = CreatePKClass(typeof(T), primaryKeyMembers, baseTypeParams[0], out memberBindings);
            //change the query to only get the PK members
            IQueryable pkQuery = GetPKQuery(baseQuery, pkType, baseTypeParams, memberBindings);
            //extract the select query for the primary key members
            System.Data.Common.DbCommand pkSelectCommand = dc.GetCommand(pkQuery);
            string whereClause = null;
            if (primaryKeyMembers.Count == 1)
            {
                whereClause = "where " + primaryKeyMembers[0].Name + " in (" + pkSelectCommand.CommandText + ")";
            }
            else
            {
                string ct = pkSelectCommand.CommandText;
                whereClause = ct.Substring(ct.IndexOf("\r\nfrom ", StringComparison.InvariantCultureIgnoreCase));
            }
            //generate a select query with the update expression as the where clause
            IQueryable<T> updateOPQuery = ((IQueryable<T>)dc.GetTable(typeof(T)).AsQueryable());
            updateOPQuery = updateOPQuery.Where(updateOperation);
            //get the table name
            string qualifiedTableName = dc.Mapping.GetTable(typeof(T)).TableName;
            //extract the where clause from the update operation query
            System.Data.Common.DbParameterCollection setClauseParameters;
            string updateClause = GetSetClauseFromWhere(dc.GetCommand(updateOPQuery), qualifiedTableName, out setClauseParameters);
            //concatenate the update statement
            string updateStatement = ReplaceSQLParameters(
                "update " + qualifiedTableName + " " + GetLockHint(lockMode)
                + " set " + updateClause
                + " " + whereClause);
            //merge parameter groups into a single array
            object[] parameters = MergeParameters(setClauseParameters, pkSelectCommand.Parameters);
            //excute the update
            dc.ExecuteCommand(updateStatement, parameters);
        }
        private static List<MetaDataMember> GetPKMembers(DataContext dc, Type baseType)
        {
            //get primary key members for a L2S class/type
            return
                (
                    from pkMember in dc.Mapping.GetTable(baseType).RowType.DataMembers
                    where pkMember.IsPrimaryKey == true
                    select pkMember
                ).ToList<MetaDataMember>();
        }
        private static IQueryable GetPKQuery(IQueryable baseQuery, Type pkType, ParameterExpression[] baseTypeParams, MemberBinding[] memberBindings)
        {
            //generate a lambda for creating a new instance of the PK type, mapped to the base type for the table
            LambdaExpression lambda
                = Expression.Lambda(
                    Expression.MemberInit(
                        Expression.New(pkType),
                        memberBindings),
                    baseTypeParams
                  );
            //create a new query returning only the PK members as instances of the new pk type
            return
                baseQuery.Provider.CreateQuery(
                    Expression.Call(typeof(Queryable),
                            "Select",
                            new Type[] { baseQuery.ElementType, lambda.Body.Type },
                            baseQuery.Expression,
                            Expression.Quote(lambda)
                    )
                );
        }
        private static Type CreatePKClass(Type baseType, List<MetaDataMember> primaryKeyMembers, ParameterExpression baseTypeParam, out MemberBinding[] memberBindings)
        {
            //generate a new assembly and class
            AssemblyBuilder assembly
                = AppDomain.CurrentDomain.DefineDynamicAssembly(
                    new AssemblyName("PKClasses"), AssemblyBuilderAccess.Run);
            ModuleBuilder moduleBuilder = assembly.DefineDynamicModule("PKModule");
            TypeBuilder typeBuilder = moduleBuilder.DefineType("PKClass_" + baseType.Name, TypeAttributes.Class | TypeAttributes.Public);
            //add the primary key members to the new type
            foreach (MetaDataMember pkMember in primaryKeyMembers)
            {
                string memberName = pkMember.Name;
                Type memberType = pkMember.Type;
                //create a private field for storage
                FieldBuilder storageField = typeBuilder.DefineField("_" + memberName, memberType, FieldAttributes.Private);
Â
                //add a get method, returning the storage field
                MethodBuilder propertyGet
                    = typeBuilder.DefineMethod("get_" + memberName,
                        MethodAttributes.Public | MethodAttributes.SpecialName | MethodAttributes.HideBySig,
                        memberType, Type.EmptyTypes);
                ILGenerator getIL = propertyGet.GetILGenerator();
                getIL.Emit(OpCodes.Ldarg_0);
                getIL.Emit(OpCodes.Ldfld, storageField);
                getIL.Emit(OpCodes.Ret);
Â
                //add a set method, setting the storage field
                MethodBuilder propertySet
                    = typeBuilder.DefineMethod("set_" + memberName,
                        MethodAttributes.Public | MethodAttributes.SpecialName | MethodAttributes.HideBySig,
                        null, new Type[] { memberType });
                ILGenerator setIL = propertySet.GetILGenerator();
                setIL.Emit(OpCodes.Ldarg_0);
                setIL.Emit(OpCodes.Ldarg_1);
                setIL.Emit(OpCodes.Stfld, storageField);
                setIL.Emit(OpCodes.Ret);
                //create a public property for the member
                PropertyBuilder memberProperty = typeBuilder.DefineProperty(memberName, System.Reflection.PropertyAttributes.HasDefault, memberType, null);
                //assign the get/set methods to the property
                memberProperty.SetGetMethod(propertyGet);
                memberProperty.SetSetMethod(propertySet);
            }
            //return the type of the generated class
            Type pkClass = typeBuilder.CreateType();
            //get member bindings mapping the members of the pk class to the corresponding members in the base class
            //(out param)
            memberBindings =
                (
                from pkMember in primaryKeyMembers
                select (MemberBinding)Expression.Bind(pkClass.GetProperty(pkMember.Name),
                            Expression.Property(baseTypeParam, baseType.GetProperty(pkMember.Name)))
                ).ToArray<MemberBinding>();
            //return the new class' type
            return pkClass;
        }
        private static string GetLockHint(LockModeEnum lockMode)
        {
            //return a lock hint based on the specified lock mode
            switch (lockMode)
            {
                case LockModeEnum.NotSpecified:
                    return "";
                case LockModeEnum.Row:
                    return "with (rowlock)";
                case LockModeEnum.Table:
                    return "with (tablock)";
                case LockModeEnum.TableExclusive:
                    return "with (tablockx)";
                case LockModeEnum.Page:
                    return "with (paglock)";
                case LockModeEnum.NoWait:
                    return "with (nowait)";
                default:
                    return "";
            }
        }
        private static string GetSetClauseFromWhere(System.Data.Common.DbCommand updateOperationCommand, string qualifiedTableName, out System.Data.Common.DbParameterCollection commandParameters)
        {
            //get just the criteria part of the query
            string commandText = updateOperationCommand.CommandText;
            commandText = commandText.Substring(commandText.IndexOf(qualifiedTableName) + qualifiedTableName.Length);
            commandText = commandText.Substring(commandText.IndexOf("where", StringComparison.InvariantCultureIgnoreCase) + 5);
            //new stringbuilder for generating the set clause
            System.Text.StringBuilder ctBuilder = new System.Text.StringBuilder();
            //replace 'and' and 'or' with comma separators
            string[] separators = { " and ", " or ", " AND ", " OR "};
            foreach (string statementPart in commandText.Split(separators, StringSplitOptions.None))
            {
                //remove unnecessary parenthesis wrappers
                string sp = statementPart.Trim();
                if (sp.StartsWith("(") && sp.EndsWith(")"))
                {
                    sp = sp.Substring(1, sp.Length - 2);
                }
                //remove "[t0]." table alias
                sp = sp.Replace("[t0].", "");
                //add to the string builder
                ctBuilder.Append(sp);
                ctBuilder.AppendLine(", ");
            }
            ctBuilder.Remove(ctBuilder.Length - 4, 4);
            //return the command parameters in the out param variable
            commandParameters = updateOperationCommand.Parameters;
            //return the where-clause turned into a set clause
            return ctBuilder.ToString();
        }
        private static string ReplaceSQLParameters(string sqlCommand)
        {
            //replace @0, @1, etc with {0}, {1} etc...
            Regex paramMatching = new Regex(@"@p\d", RegexOptions.Compiled);
            int paramNo = 0;
            while (paramMatching.IsMatch(sqlCommand))
            {
                sqlCommand = paramMatching.Replace(sqlCommand, "{" + paramNo.ToString() + "}", 1);
                paramNo++;
            }
            return sqlCommand;
        }
        private static object[] MergeParameters(params System.Data.Common.DbParameterCollection[] parameterGroups)
        {
            //merge parameter groups into a one-dimensional object array. (nested loops as the dbparametercollection don't implement IEnumerable)
            List<object> parameters = new List<object>();
            foreach (System.Data.Common.DbParameterCollection parameterGroup in parameterGroups)
            {
                foreach (System.Data.Common.DbParameter parameter in parameterGroup)
                {
                    parameters.Add(parameter.Value);
                }
            }
            return parameters.ToArray();
        }
    }
}

3 November 2008

Quiet blog, gone fishing, is Linq-to-SQL dead?, etc

Filed under: off-topic — Tags: , , , , , , — Kristofer @ 1:48

It has been a while since I posted anything on the blog. Did I stop blogging? Nope, just been having fun with other stuff. Like optimizing indexes for an oracle database, adding new features to my L2S/EF toolkit etc.

Gone Fishing

Right now I am “busy” fishing. It is Sunday and we’re at our vacation house on the countryside 130km to the west of Bangkok, where we have a small lake full of fish. I have been trying to catch a fish since yesterday but so far it is 2-0 to the fish. Over night, two tackles and hooks just disappeared. One large triple pointed hook attached to a steel wire and baited with a chicken skin is gone, the steel wire had snapped or been cut off. So I either got a very large fish with razor sharp teeth hanging out in the lake, a fish with a wire cutter, or maybe our resident monitor lizard got it.

The second missing tackle is also a triple pointed hook attached to half litre water bottle. I use water bottles as floats as they allow the fish to swim around while making it easy for me to find them. Until now. I have checked everywhere, and it is just gone. Not in the lake, and not anywhere in the irrigation canals out in the back of the garden. If the monitor lizard got it I would have expected to find it outside of her nest but no trace there either. (I know it’s a ’she’ because she laid eggs a few months ago. Unfortunately someone poached the nest a couple of weeks later so we won’t get any cute baby monitors. Monitor lizard eggs are turned into “performance enhancing” omelet by the locals.)

Some time whatever animal got it will have to surface, no fish can stay under water forever with an extra half kilo bouyancy. One remote possibility is that I got a fish on it and the fish got eaten by something larger. Like a python. They’re not resident (they’re more like consultants :) ) so if a python got it, it has probably already left our garden. If anyone comes across a large python dragging around a Minere bottle attached to a string, please return it to me. :)

 

EF performance

Enough off-topic stuff. I promised a while ago that I would write something about Microsoft’s new ADO.NET Entity Framework and performance. I was planning to do that, but I feel it is no longer necessary. There has been a lot written about it already and it isn’t pretty. Somehow they forgot to test EF and L2E beyond “Hello World”. Complex LINQ-to-entities queries are translated into really poor SQL. Less complex queries like a simple “from person p in oc.Persons where p.LastName.StartsWith(lastName) select p” are translated into equally bad SQL that will always force a table scan. Basic Math library operations such as “Round” in L2E queries are not supported at all by EF. The list goes on and on, but in short it EF v1 was simply released way too early, the current version should have been labelled “CTP” instead of “RTM”.

In contrast, L2S generates remarkably good and efficient SQL queries from just about any Linq expression thrown at it.

The future of Linq-to-SQL and Entity Framework/Linq-to-Entities

Microsoft made an announcement regarding Linq-to-SQL and Linq-to-Entities last week. Actually they made a lot of announcements regarding Linq-to-Entities last week during PDC. During one session the EF program manager, Tim Mallileau showed what they’re working on for future versions of Entity Framework. A lot of the “new” stuff in EF vNext are features that already exist and work in Linq-to-SQL such as deferred loading, POCO support with object-relational mappings described as attributes instead of in a bloated embedded EDMX file. That’s all nice, and shows that they have learnt at least a few things from the EFv1 fiasco, so now they’re trying to merge what is already there in L2S into EF. Wouldn’t it be smarter to cut the losses on EF and instead continue forward with Linq-to-SQL? At a minimum, opening up the provider model in L2S would be a nice first step…

So instead, they came out with an announcement that a large part of the L2S user base interpreted as a death certificate for Linq-to-SQL. Read it here: http://blogs.msdn.com/adonet/archive/2008/10/29/update-on-linq-to-sql-and-linq-to-entities-roadmap.aspx

A lot of heated discussions and upset blog entries ensued, people asked “Has Microsoft really killed Linq-to-SQL?” on stackoverflow etc. As always, Roger Jennings does a great job at summarizing what is being written around the blogosphere so head over to http://oakleafblog.blogspot.com/ for a summary…

Is Linq-to-SQL dead?

So is Linq-to-SQL dead? No, Linq-to-SQL is not dead. Heck no. (to paraphrase some ’softies). Microsoft can’t kill it. Linq-to-SQL is built into the .net framework as System.Data.Linq etc and it will stay there. To my knowledge, so far they haven’t ever removed/killed any features from the .net framework, but some namespaces don’t get a lot of new stuff added to them. That doesn’t mean they’re dead.

What I think the statement from Microsoft can be interpreted as is: “We’re not going to commit resources to improving Linq-to-SQL because EF is our big strategic object-relational mapping tool.“. And that is not really a big problem. Linq-to-SQL works really good out of the box, from a developer’s perspective it is easy to get started with, easy to use and really speeds up development of database-driven apps. Sure there are areas that can be improved, but the majority of those things can be solved without Microsoft having to make changes in the framework. More importantly, Linq-to-SQL is ready to use in app development today while EF is still really just a CTP release although Microsoft keeps touting it as RTM.

New features for the designer is one good example where L2S can be improved without the help of Microsoft, and I think I have already demonstrated that it can be done as VS addins. Code generation extensibility has been covered by Damien Guard in his T4 templates for L2S. There are plenty of runtime enhancements and code samples out there as well. Linq-to-SQL fans just need to follow Roger Jenning’s blog and they’ll find everything there is to find about Linq-to-SQL. So as long as Microsoft takes care of the occassional bug that might surface in L2S that’s fine and I’m sure they will. Although the documentation don’t mention it, they seem to have included a bunch of improvements in .net framework 3.5 - for example queries with multiple aggregates now come out clean and much more optimized than they did in the RTM version.

Oslo - the city with the most expensive beer in the world

Another interesting thing that was announced and demonstrated at MS PDC last week is something called “Oslo”. Besides being a city with the most expensive beer in the world, “Oslo” is also the code name for a new set of modelling tools from Microsoft, slated for release in 2010. I first heard about it a couple of months ago when I got an email from Don Box asking me if I ever considered working for Microsoft.

I have considered that a couple of times, the first time in 1997 or 1998 when I was interviewed for a job at Microsoft in Stockholm. That time I went for five separate in-person interviews (I lived 500km from Stockholm at the time), one which included filling out a “personality test” that bore a resemblance to the OCA test. They (MSFT) still couldn’t make up their mind after the fifth interview so I cut it short.

But since I got an email from a famous guy I did consider working for MSFT once again and so I attended a couple of phone interviews. This time there was no personality test involved so either they got rid of that test or maybe it was just something used by Microsoft in Stockholm. This time I got to the third interview, the tech interview. I botched it on a couple of questions on swapping items in linked lists, and on QSort. Don’t know what I can blame that on but I guess it being 7am and me not being a morning person is one semi-valid excuse. Anyway, they were cool about it and said it had to do with budget constraints and not me messing up the linked list… :) Budget constraints at Microsoft? That’s like snow in Bangkok…

So no, I’m not going to become a ’softie. Anyway, the whole experience got me interested in knowing more about “Oslo” (mainly because Don Box is leading the Oslo team, making it highly likely that something cool has to come out of it), so I downloaded the Oslo SDK CTP when it was released last week.

Oslo includes a graphical modelling tool, Quadrant (which is not included in the CTP download), a text-based modelling language called M, some lower level languages for defining domain specific languages, a repository design pattern that appear to be aimed at ensuring the generated databases will play nice in their new “Cloud” services platform, called “Windows Azure” (with a worse nickname). The “Oslo” tools are still in their infancy so there’s a lot of stuff missing, but it is still interesting to play around with it.

If you are interested/active in model driven development, drift over to the msdn Oslo Developer Center,  download the SDK, take it for a test spin, browse the Oslo msdn forum and let them know what features you would like to see in Oslo.

Personally, I am hoping that this time around MSFT will release a modelling tool that [on the database modelling side]:

  • is at least as easy to use as Visio, and as flexible as ERStudio and ERWin together
  • supports all basic/common data modelling scenarios
  • will successfully be able to reverse-and-forward engineer at a minimum Northwind and AdventureWorks without losing any database objects
  • will handle versioning of data models and schemas; schemas evolve over time in any system so the ability to update an existing database with model changes is a fairly basic requirement in my opinion
  • will be flexible enough to support the different naming conventions commonly in use

Now I’m off to check if I got a fish or if the missing bottle/tackle has surfaced anywhere. If not, I’m going to switch to net-fishing next week.

2 September 2008

Tools - Part 5 - Add-ins - Documentation features in Entity Framework vs Linq-to-SQL

Filed under: Tools — Tags: , , , , — Kristofer @ 14:16

EF Designer - Documentation Features

A couple of weeks ago I mentioned that the ADO.NET Entity Framework designer has a new documentation field that allows entity types and members (classes and properties) to be documented right in the designer.

Documentation fields in the EF designer's property dialog

Documentation fields in the EF designer's property dialog

I think this is a useful feature - once descriptions have been typed in for all classes and properties, the code editor will show the descriptions in the tooltip for the relevant classes and properties.

Tooltip on an entity type in the Visual Studio code editor

It has a cosmetic flaw: In the haste to release VS2008 and .net 3.5 SP1 it seems like MSFT forgot to apply the entity type documentation to the entity set accessor properties, so entity sets (and all undocumented classes and properties) will show the default “There are no comments for [name] in the schema.“. I would like to see the EntitySet using the same documentation as the EntityType, or at least have a “Documentation” property of its own. (It does in the EDMX, but not in the designer)

EntitySet tooltip

The “There are no comments” default value is a bit redundant and waste of screen real-estate. Why not leave the default value blank? Oh well, this is just version one so I guess that will be improved in some future service pack update.

Also missing is the ability to automatically populate the documentation field with descriptions from data model diagrams and/or the database. The descriptions may already exist in a data model diagram or in the database. In addition, knowing what indexes exist is very useful when writing queries, and what indexes exist on tables etc are of course available from the database so being able to automatically retrieve those two pieces of information from the db would be a nice-to-have feature.

Linq-to-SQL documentation features

Being a fan of the more stable and faster OR mapper Linq-to-SQL, I would like to be able to have the same kind of inline/tooltip documentation for my L2S classes and members. However, the L2S designer does not yet have the “Documentation” field that the EF designer do. I wonder why…

Maybe MSFT will add the documentation fields to some future version of the L2S designer, and maybe they will also add coupling to build documentation from the database. I’m not going to hold my breath while waiting for this so instead I decided to add this as a new feature in the Huagati DBML Tools add-in for Visual Studio.

In brief, it works like this:

The add-in has a new menu option called “Update Linq-to-SQL documentation from database”:

New menu option for updating L2S documentation from the database

New menu option for updating L2S documentation from the database

When the new documentation feature is used, the add-in will retrieve table and column descriptions from the database, along with index definitions for all tables.

SQL Server Management Studio

The add-in updates the L2S generated code with summary xml comments and description attributes for all entities/classes, members/properties and data context entity accessor properties. I would have preferred to keep the documentation in a separate file, but unfortunately I don’t see a way to document member properties from separate partial classes so for now it is done inside the generated file. Not a big problem since the documentation can be re-generated at any time by just selecting the “update” menu option.

The end-result is that I can now see both descriptions and index information in-line while writing code against the L2S generated classes.

Voilà! Finally some nice descriptions and db index information right in the code editor.

Linq-to-SQL datacontext's table accessor property with tooltip

Linq-to-SQL generated entity type with tooltip

Tooltip for member property in Linq-to-SQL generated class 

Availability

The Linq-to-SQL documentation feature is available in version 1.20 of the Huagati DBML Tools add-in that can be downloaded from http://www.huagati.com/dbmltools/ . Users of previous versions can upgrade for free by downloading the latest version and installing it over the existing version. New users can download the add-in and get a trial license or a full license from the same page.

15 August 2008

Tools - Part 4 - Add-ins - Adding functionality to the Entity Framework EDMX designer

Filed under: Tools — Tags: , , , , — Kristofer @ 17:20

After playing around a bit with the ADO.NET Entity Framework designer in Visual Studio 2008 SP1 I was a bit disappointed with its’ user friendliness (or lack thereof). I guess this can be read between the lines in my previous entry on EF.

Instead of waiting for the next service pack for Visual Studio I figured it is more constructive to try to work around the shortcomings in Visual Studio by adding the missing functionality. So I started adding some edmx-related functionality to the Huagati DBML Tools add-in package. That immediately makes the original choice of name for the add-in a bad one, so I guess from now on it will be “Huagati DBML and EDMX tools”… :)

EDMX Addin #1: Renaming entity sets, entity types, properties, and navigation properties in EDMX (ADO.NET Entity Data Model) files

The first function, released today, is a renaming function that goes through all entity sets, entity types (classes), properties, and navigation properties in a EDMX file (EF designer) and renames them according to a user-selectable set of rules.

This is more or less similar too the dbml naming convention add-in but this one works with the Entity Framework designer instead of the Linq-to-SQL designer. It updates the conceptual and mapping side and also comes with a new shiny options dialog allowing users to specify what naming rules to apply. The options dialog also has a “preview” field for each section at the bottom of the screen that shows what the outcome will be on classes/properties in the currently open edmx file if using the selected settings.

Please allow me a screenshot-by-screenshot demonstration/walk-through:

1) Right after using the edmx import wizard, the conceptual layer objects; entity sets (accessor properties), entity types (the classes representing tables/views/etc), properties (fields/members) all use the names from the database. This is fine if the naming conventions used in the database match .net naming conventions, but in reality many people use other naming conventions in database schemas than in .net code for a variety of reasons.

Manually renaming all entity sets, entity types and properties to use .net naming conventions after running the EF import database wizard - for nice standardized names in the .net code - can be a lengthy exercise already on a medium sized data model.

The edmx designer for the HuagatiRes database after creating a new ADO.NET entity data model using the import wizard.

(Click on the screenshot for a larger view)

2) The add-in plugs into the Tools menu in Visual Studio, adding a new option “Standardize ADO.NET Entity Data Model class and member names” whenever the entity data model designer is open. (Long menu name, I know. Maybe I’ll rename it some day…)

The add-in plugs into the Tools menu in Visual Studio...

3) After selecting the “standardize…” tool, an options dialog will appear. This dialog allows the user to control whether to apply ProperCase (TitleCase) to names, remove underscores, pluralize accessors, remove suffixes/prefixes such as “tbl_”, “int_”, add new prefixes/suffixes etc etc.

Naming rule dialog...

(Click on the screenshot for a larger view)

4) ..and after hitting OK in the options dialog, the add-in will process all entity sets, entity types, properties, navigation properties etc and update the names according to the rules specified:

The Visual Studio 2008 EDMX designer after running the naming convention add-in

(Click on the screenshot for a larger view)

It is all pretty straightforward, but this function really saves a lot of time when importing database schemas into the Entity Framework designer. As the screenshot above shows it spared me from having to re-type (or rename) the 1777 classes and properties generated from the HuagatiRes database by hand, and instead automated the whole process and updated the entire conceptual layer in a matter of seconds.

This new add-in function along with other (yet to be released) EDMX / EF-related functionality comes packaged together with the DBML functionality in the Huagati DBML Tools add-in. If you want to try it out, you can download a trial version and get a trial license key from http://www.huagati.com/dbmltools/ , and if you like it I hope that you upgrade to either one of the paid-for versions to support continued development.

If you already have a license for the DBML Tools add-ins, you can upgrade by simply downloading and copying the latest version over your existing version and re-start Visual Studio.

13 August 2008

Application Architecture - Part 5a - Data Access Layer - A closer look at ADO.NET Entity Framework

Filed under: Architecture — Tags: , , , , — Kristofer @ 16:57

In two previous articles in the architecture section - Part 1 and Part 4, I mentioned that I intended to take a closer look at Microsoft’s Entity Framework once it has been RTM’d (released) in order to decide if ADO.NET EF or Linq-to-SQL is a better base for my DAL for HuagatiRes. Well, now that EF has been released as part of [the hastily RTM'd] Visual Studio 2008 SP1 and .net 3.5 SP1 I guess it is time to start playing around with it.

After several re-tries downloading the 831Mb VS2008SP1 image (all failed thanks to my ISP CSLoxinfo’s ‘invisible’ caching proxies), and getting past the installation quirks I finally got the service pack installed late last night. I quickly proceeded importing the HuagatiRes DB into an entity data model (EDMX file) and noticed some nice improvements in the EDMX designer over the Linq2SQL designer.

Improvements over the Linq to SQL designer

Some of the improvements that I first noticed in the designer:

  • The designer clearly identifies what relationships/associations/FKs are one-to-zero-or-one, one-to-one, one-to-many etc.
  • The navigation properties related to associations are clearly visible in the designer.
  • There is a model browser that plugs right into the same panel as the solution explorer that allows both the store and classes to be explored and looked up in the model.
  • It can even update the model from the database without having to resort to writing an add-in:)
  • Foreign key constraints behind associations are identified by name in the model browser and properties dialog.
  • There is a documentation field in the properties dialog where entities, members, associations etc can be documented.

So yes, the EDMX designer at a first glance has some improvements compared to the Linq2SQL designer.

Designer shortcomings

Improvements aside -  there are some missing features that at least I think would have been nice to have in the EDMX designer. Maybe they’re coming in a future version (SP1a, SP1b, SP2, …?)

  • Although the model explorer, and association properties, shows both FK constraint details as well as primary keys on tables, it does not show any indexes other than the PK. Why not both show the indexes [and index members] in the model explorer, and import index info and at least have attributes storing information about available indexes? Surely a more complete overview [including indexes] would be useful for all database developers. It would also be nice to see index info in the intellisense bubble when typing where clauses and joins in LINQ expressions, but of course that would require that the model (edmx and generated classes) include that information first. Maybe the EF team need to include a handful of enterprise developers in their advisory council
  • Progress dialogs for lengthy operations.

Let me elaborate a bit on this point:
I decided to take EF for a test-drive on a enterprise-system-size data model. The database I decided to use for this test-drive is not overly large, but large enough to fall in the enterprise-system category with its’ 1000+ tables and 2000+ associations.

I started generating the ADO.NET Entity Data Model (EDMX) from the database at 4PM. At first VS kept SQL Server very busy. The only feedback I got from VS was the hour glass mouse pointer, along with “(Not Responding)” in the title bar. There is no [visible] progress bar, no status messages or anything else to indicate how far it got.

Visual Studio hanging while generating an EDMX file for a 1000+ table database

Maybe I misinterpreted something, but when reading some article about EF somewhere on the intarweb I got the impression that EF was aimed at enterprise developers so I thought this would be an interesting test. But then again, maybe it is for enterprise developers with small databases / small data models.

 

…on to the last “missing-but-nice-to-have” items in the EDMX designer:

  • Naming convention support…  Oh well, I’ll add edmx support to a future version of the naming convention thingie in the DBML Tools add-in.
  • A way to import the table/column/fk human-readable descriptions/notes/documentation from data model diagrams (e.g. from Erwin .erx, ERStudio .dm1, or Visio for Enterprise Architects .vsd  files) into the description fields in the EDMX file. Well, there’s an idea for an add-in for Visual Studio. Maybe someone who doesn’t like re-typing the descriptions of every table, field and foreign key from the data model (like me) will write one…
  • Multi page / tabbed diagrams. A single diagram designer surface for the entire db is not really useful for schemas larger than Northwind. It looks like someone thought about it because the edmx file would support it although the designer don’t:
    <edmx:Diagrams>
          <Diagram Name=”HuagatiRes”>
                …     

 

Conclusion

The edmx designer has some nice improvements compared to the Linq-to-SQL designer. However, it lacks some basic features that would make it useful when working with schemas larger than Northwind.

I hope that Microsoft will take a look at data modeling tools to get some feature ideas for the next version / next service pack. Although the edmx and dbml designers are not intended to be used for data modelling, bringing some of the basic features that all data modelling tools have into the designers would really make them more useful during development.   …or why not even take it a step further and add the features needed to use these designers to model both the object model and the database schema?  That would eliminate the need for separate data modelling tools and could allow both the database schema and object model to be created from inside Visual Studio.

 

Coming up….: The next article in this series (5b) will re-visit the performance comparison between EF and Linq to SQL that Roger Jennings performed on the beta. I am hoping that the beta version was slowed down by debug code and that the RTM version will be faster / have less overhead.

 

Update: An hour later (two hours after starting the EDMX generation) I noticed that VS is no longer beating up SQL Server. Visual Studio is still hanging, but it appears to be stuck doing some internal work and using a full CPU to do whatever it is that is keeping it busy.

Visual Studio is still hanging after two hours with no UI feedback on what it is doing.

Update 2 (6 hours later): Visual Studio is still hanging, and I have to admit that trying to import the schema of a large database into a single ADO.NET Entity Data Model (EDMX) was a stupid experiment. It has to be divided up into smaller chunks for a) EF to be able to handle it, b) for maintainability, and c) for source control due to the single-EDMX-single-cs-file-per-data-model-approach.

Unfortunately chunking it up into many DALs leads to duplicated entity definitions for common/shared entities but at this point it appears that EF is not yet up to the challenge of supporting larger schemas.

Hopefully a future version of EF will support using and merging multiple designer surfaces with support for shared entities. Maybe by then it will even be able to use the information already available in the data model diagrams (erx/dm1/vsd) to divide up the schema in functional areas / subject areas.

For now guess it would be a good idea for Microsoft to add a “top 200″ or “top 400″ [or whatever is the limit for the EF designer] to the SQL Server catalog queries that the EDMX designer thing uses to import tables. Just to avoid having people try this kind of stuff…

6 hours later... 

Update 3 (16 hours later): The next morning VS was still hanging with no indication of how far it got and how much longer it would take. I decided it was time to terminate the experiment…

16 hours after the experiment started - terminating...

A closer look at the files in my project folder revealed a 8Mb+ edmx file along with a 1.75kb .cs file. I opened the edmx file using a more reliable/stable development tool; MS Visual Notepad and noticed that all entities, members, associations etc had been defined in the edmx. However, the diagram section was empty so I am guessing that Visual Studio spent all night on trying to decide how to layout the 1000+ files in a single designer surface.

The generated edmx file is without layout information

After restarting Visual Studio I tried adding the generated edmx file to an existing project. It got added but trying to open the designer resulted [again] in a hung Visual Studio (using 100% of one CPU core).

Update 4: I tried to generate the entity model using the command line tool for EF - edmgen.exe - hoping that it is better suited for large schemas than the edmx designer. I started it at 9:35am using the FullGeneration option. Unlike the designer/wizard in Visual Studio, EdmGen does output status messages.

 

C:\temp>edmgen /mode:fullgeneration /project:LargeDB /provider:System.Data.SqlClient
     /connectionstring:"server=(local);integrated security=true;database=largeenterprisedb"
Microsoft (R) EdmGen version 3.5.0.0
Copyright (C) 2008 Microsoft Corporation. All rights reserved.
Loading database information...
Writing ssdl file...
Creating conceptual layer from storage layer...
Writing msl file...
Writing csdl file...
Writing object layer file...
Writing views file...

Update 5: EdmGen crashed after 20-25 minutes.

edmgen crash

The console output simply stated “Process is terminated due to StackOverflowException.”

I rest my case. And I promise I will stick to small schema databases in my continued explorations of the ADO.NET Entity Framework.

 

Update 6: A Microsoft rep told me in the support forum that the EF designer is good for models with up to 120 tables. “Our response for the designer is that performance is typically reasonable up to about 120 tables, after which thinkgs begin slowing down.

http://forums.microsoft.com/Forums/ShowPost.aspx?PostID=3757588&SiteID=1

I should have asked before trying to use EF with a larger model… :)

5 August 2008

Tools - Part 3 - Add-ins - Writing optimized queries

Filed under: Tools — Tags: , , , — Kristofer @ 17:23

Every now and then I am commissioned to do database performance investigations for clients. This usually happens when they have a production system that has finally reached the point where users get frustrated, or - even worse - the system stops performing vital tasks because it is overloaded.

Configuration

There are a number of things that can cause database performance problems but there are a few causes I encounter much more frequently than others. Hardware configuration, OS configuration and database configuration are the first things I take a look at together with perfmon data to get an overview over where there may be bottlenecks. On Windows/SQL Server systems running large databases the most commonly overlooked or omitted configuration issues I encounter are:

  • Incorrect memory configuration - systems with >3Gb RAM without using the appropriate combination of /3Gb, /PAE, and /UserVA boot flags. SQL Server can make really good use of every little chunk of memory it is allowed to access so on some systems just setting these options so they match the environment have provided a “night and day difference”.
  • Less than optimal utilization of available disk bandwidth. Disk bandwidth is the most scarce resource on any database server so utilizing the available disk bandwidth by using the correct raid configuration, placing data files, transaction logs, tempdb on the right spindles etc can make a huge difference.
  • Non-vital services running on the database server; antivirus, secondary [test] instances of sql server, unused web servers, indexing services and other ‘resource thieves’.

All these are issues that should have been taken care of by a DBA or sys admin [if there is a dba or sys admin]. Either there hasn’t been a DBA involved in setting up systems plagued by these issues, or the DBA/sysadmin does not have the knowledge or resources needed to perform his/her duties, or the configuration has changed since the DBA was originally involved. Whatever the cause, these simple configuration issues are things that I have stumbled over on many production systems and a couple of changes and reboots later the systems are a lot snappier. I’m not going to delve into the details of these to not go too far off-topic, and besides there are so many articles and blogs entries covering these aspects already so there is no need to repeat it.   [However, if you need assistance with reviewing configuration on an existing system, feel free to contact me for a quotation :) ]

Bad queries

Once I have looked at the basic configuration settings [along with perfmon and profiler data to determine the optimum configuration for a specific system] I normally take a closer look at the user activity on the system, i.e. queries, connections etc from the various client- and reporting applications typically involved. This is the area where I frequently find the worst offenders. Even a single report query, innocently added to address some business need, have been proved over and over to bring otherwise optimally performing databases on its’ knees (or even below the floorboards).

In a perfect world, which applications should have access to production databases and all queries allowed into a production system are strictly controlled and everything new is thoroughly reviewed before being allowed into a production environment. In the real world, business requirements, time- and resource constraints tend to make individuals and organizations break these rules. So in the real world, where do the queries that run against production databases originate from? A few common sources are:

  • Code written by programmers that understand how the database works, and how a query will be executed usually take care to write good, optimal queries that have been thoroughly tested and reviewed to ensure that criteria and groupings use the right indexes, that joins will be picked up in the right order by the optimizer, that the necessary indexes exist in the first place, that the overall execution plan is optimal etc. These guys rarely cause a problem, and when they do it is usually because of time constraints that cause them to write sloppy code.
  • Code written by programmers that don’t understand how the database works, and don’t have the interest, time, or ability to gain that knowledge/understanding. Some of the people that fall in this category know their own limitations and get their work reviewed, others don’t and can - with a few lines of code - bring disaster into a system that is otherwise in harmony with its’ enviroments. Some organizations have mechanisms in place to catch the work of this group - code reviews, test cycles with dba reviews etc while others don’t.
  • Business intelligence / custom reporting. I’m not disputing the importance of these tools to address business needs, but the easier these tools become to use for laymen, the more dangerous they [potentially] become to the systems they interact with. Again, in the ‘perfect world’ scenario, all BI and custom reporting are kept on separate systems - whether log-shipped or replicated copies of the production database or separate data warehouse databases with pre-aggregated data -  well away from the transactional production systems. In the ‘real world’ they sometimes interact directly with the production system database(s) where a seemingly innocent query dragged-and-dropped together in a query builder tool can bring a system on its knees.
  • Add-on web interfaces; web pages or web services - not part of the core system - that provide customers, suppliers or other external parties access to certain kinds of information from otherwise internal systems. These tend to be written by in-house IT department developers to add functionality to systems provided by ISVs, and if the people writing them either fall in category #2 or don’t have enough knowledge about the system they are interfacing with these seemingly innocent on-the-sidelines mini apps can in the real world be the root cause of serious performance problems.

Tools

If you have made it this far, you are probably asking yourself: why this rant in an article in the tools category?

The simple answer: addressing the problem of poorly written queries early on in a development cycle makes it easier to avoid them altogether. And the best way to avoid them is by using the right kind of tools.

Linq to SQL out of the box gives us type safety and intellisense when writing queries, which vastly improves productivity when writing code that interacts with the database. It makes the queries known to the compiler already at compile time in the vast majority of cases. [dynamic queries excluded].

In the past, bad queries had to be caught after they were written; in code reviews, dba reviews, using profiling tools such as SQL Server Profiler on test- and production systems etc. With Linq to SQL this is even more true when using the out-of-the box tools. In fact, the main argument used by many Linq to SQL detractors is that it takes control over query composition and “hides” the queries away from programmers and DBAs. This is true when using the out-of-the-box [first] version of Linq2SQL (and EF), but that doesn’t mean it have to be that way, and it does not necessarily mean that taking SQL composition away from the programmers or DBAs have to be a negative thing.

So, to address this problem I am working on a couple of additional tools that will plug right into Visual Studio together with the Huagati DBML Tools add-ins and add some new functionality.

Tool #3: DBML Explorer

I will start the numbering with #3 as there are already two tools included in the Huagati DBML Tools toolkit. The DBML Explorer will be the third tool to be added to the toolkit.

Today, Visual Studio gives programmers access to the database objects through the Server Explorer sidebar, and access to the ORM side of it through the class designer(s) and properties dialog. The properties dialog also shows some basic database attributes such as table name, column data types etc. However, neither of the two gives a unified view over both, and on top of that neither are particularly easy to navigate for large databases.

The DBML Explorer is intended to overcome this and bridge the gap. It combines the information from the ORM side (DBML file / Linq2SQL designer) with schema information; table details, column details, index information, foreign keys etc. With this information at hand directly in Visual Studio it will be easier to write Linq queries against the database while being aware of what indexes are available for the query, comparative size of tables etc.

Preview Screenshot of the DBML Explorer

Preview Screenshot of the DBML Explorer

Tool #4: DBML Query Builder

Although I don’t like using query builder tools myself - I actually enjoy  writing SQL (and nowadays LINQ) queries - I can see the productivity boost that a query builder tool can bring to those who are not familiar or comfortable with query language syntax. This tool extends the DBML Explorer with functionality to quickly select the relevant tables and columns, join types, groupings/aggregates, criteria, formulas and then let the tool compose the Linq query with on-screen advice on index usage etc. This tool will help beginners get up to speed with LINQ syntax as well as save typing for those who are not amused by writing queries.

There are already some Linq query builder tools out there, but what I intend to address with this one is how the final query will interact with the database. If it will end up doing a table scan on a large table [or several large tables] this should be visualized already while the query is being composed in the query builder.

Tool #5: LINQ Query Analyzer

Since Linq has already given us productivity enhancing features such as type safety and intellisense when writing queries, why not take it a step further? Since the query is no longer stuffed away in a string literal but is now part of the code, some basic things; index utilization, join order and other parts of the execution plan can be checked early on, inside Visual Studio, any time between the query is written and compiled. This means that bad queries can be caught already at design time in Visual Studio instead of later in SQL Server Profiler while troubleshooting performance issues in a production environment.

Preview Screenshot of the Linq2SQL Query Analyzer output in Visual Studio

Preview Screenshot of the Linq2SQL Query Analyzer output in Visual Studio

 

Availability of the tools

These additional tools will be included in the Huagati DBML Tools package as they become ready to use. The DBML Explorer is first up and it is scheduled to be released later this week. The other two tools don’t have a release date yet but I will update this post when the date is known.

Existing license holders

Existing license holders of the tools will be able to upgrade to a new version including the additional tools free of charge.

 

Note: This posting is provided “AS IS” with no warranties, and confers no rights.

17 July 2008

Tools - Part 2 - Add-ins - DBML synchronization add-in updated

Filed under: Tools — Tags: , , — Kristofer @ 18:45

This is just a short release notification covering the latest updates to the DBML refresh add-in.

The add-in now detects the following database changes and will apply those to the dbml:

  • Added columns
  • Dropped columns
  • Dropped tables
  • Primary key changes
  • Auto generated column changes
  • Data type changes
  • Added foreign keys
  • Dropped foreign keys [new feature added on 22 July 2008]
  • New tables [new feature added on 23 July 2008]

Still missing (but coming in a future version) are:

  • New tables [feature was added on 23 July 2008]
  • New, modified or dropped stored procedures
  • Dropped foreign keys [feature was added on 22 July 2008]

The add-in is described in more detail in the “Tools Part 1″ article and can be downloaded from http://www.huagati.com/dbmltools/

 

The current version is 1.0.3120.33307, dated 17 June 2008 18:30.

Updated on 22 July 2008: Added support for dropped foreign keys and fixed a couple of issues related to recursive FKs (referencing the same table). The current version is 1.0.3125.30798 dated 22 July 2008 17:06.

Update: 23 July 2008 - added support for new tables, fixed a pile of reported and unreported bugs and re-tested against the reservation db and Northwind. The current version is 1.0.3126.38442 timestamped 23 July 2008 21:21.

Update: 26 July 2008 - numerous fixes and adjustments. Added licensing model; the free edition now supports up to 20 tables per DBML file. For larger DBML files, a license can be acquired from http://www.huagati.com/dbmltools/

Update: 30 July 2008 - added options dialog allowing the user to select what parts to synchronize (new/dropped tables, new/dropped/modified columns, PKs, new/dropped FKs). These options can be persisted along with a table exclusion list for unwanted tables. Also added a user guide detailing how to install and use the add-ins. Updates can be downloaded from http://www.huagati.com/dbmltools/

14 July 2008

Application Architecture - Part 4 - Data Access Layer - LINQ and Linq2SQL Links

Filed under: Architecture — Kristofer @ 12:57

Linq2SQL, groupings and aggregates

In my first entry on Linq2SQL almost a month ago I listed a few of the “childhood diseases” in Linq2SQL. One of those is that grouped queries with multiple aggregates are less than optimal; the Linq2SQL provider translates such queries into one subquery for each aggregate resulting in poor I/O statistics. A few months back when I originally stumbled upon that problem while working on another LINQ2SQL based project, Hedgehog, I did some whining on the same subject in the msdn forum for the Linq project and I also submitted a suggestion on Microsoft Connect hoping for a better/more efficient Linq-to-SQL query translation in .net 3.5 SP1. The suggestion on Microsoft Connect remains untouched and I never heard anything more in the msdn forum so the workaround for now has been to keep that kind of queries in stored procedures.

That was until today when I came across the latest entry in Matt Warren’s blog. Matt has helped me out with Linq2SQL questions several times before, both in the Linq2SQL msdn forum and through his blog. In his latest blog entry, Matt covers the topic of groupings and aggregates, and he solves the problem by implementing a better IQueryable provider that supports multiple aggregates without resorting to sub queries. Very elegant, Matt!

Read more on Matt’s IQueryable provider in his blog:

Now I am really hoping that Matt’s solution will be included in .net 3.5 SP1 RTM or SP 1½ if it is too late to include any new stuff in SP1. Although I think fixes, improvements, and optimizations to already released functionality (Linq2SQL) should have higher priority in the service pack than brand new functionality such as EF that shouldn’t really be part of a service pack. But that’s Microsoft’s call and is probably more controlled by politics and the ongoing controversy surrounding Linq2SQL vs EF.

Comparison of Linq2SQL and Entity Framework

Roger Jennings of OakLeaf Systems just published a comparison of Linq2SQL and Entity Framework on his blog. He compares code size, memory footprint and more importantly initialization and roundtrip times against an empty database to see if there are any bottlenecks in the ‘plumbing’. For his tests he used my first draft datamodel as a base for generating entity classes, and he also covered some of the other content in this blog.

Roger is a consultant and author with 30+ books behind him. His books cover Microsoft technology from operating systems to databases, development etc and has been translated to 20+ languages. I’m of course flattered and humbled that Roger posted a reference to my blog, and I am happy to see that he is basing the Linq2SQL vs Entity Framework comparison on my data model. Thanks Roger!

The most interesting finding in Roger’s Linq2SQL vs EF comparison is that EF not only has a larger footprint (as expected) but it shows significantly slower roundtrips against an empty database. This is something that needs to be investigated in more detail to reveal the underlying reasons.

Read Roger’s article and his findings over at:

I haven’t yet looked into the details or potential reasons behind the huge difference between Linq2SQL and EF roundtrip times that Roger unveiled, as I have previously decided to stick with Linq2SQL until EF is at least RTMed. However, I will certainly try to repeat Roger’s comparison on EF once RTMed, in addition to a closer look at what EF throws at the database with regards to query composition and how that translates to usage of I/O capacity, CPU, locks and other database server resources.

Personally I’m quite happy with the one-to-one mapping between database tables and Linq2SQL entities, I know [by now] reasonably well how Linq2SQL interacts with SQL Server, and whenever I need further abstraction from the data model I accomplish that by writing new classes on top of the ones generated by Linq2SQL. If EF does not perform and scale at least as well as Linq2SQL then I will probably stick to [the more lightweight] Linq2SQL in combination with Matt Warren’s improved IQueryable provider.

Older Posts »

Powered by WordPress