User Group Leaders: Who Are Your Members?

At the MVP Summit last week, I was talking with fellow Dallas-area tweep Sean McCown about our local SQL Server user group membership.  I think our group is unique because of its sheer size; we typically have between 70 and 90 people at our monthly meetings, with a mailing list that goes out to between 600 and 800 people.  So the question came up, whom should we consider as members?  Should members include anyone on our list, or only the regular attendees?

Technical communities, by nature, have fuzzy edges.  People migrate in and out of them all the time, with a very small core constituency and a much larger group of folks with varying degrees of participation at any given time.  There’s nothing wrong with that; people participate as they have time, opportunity, and desire to do so, and those things can vary greatly in a person’s life.  The transient nature of these groups tends to be even more pronounced with larger groups, since it’s easier for individuals to slip in and out with relative anonymity.  To further complicate things, the intertoobz make it very easy for individuals to participate online with user communities worldwide.  I’m a perfect example of this; I am a “member” of two dozen or so local SQL Server user groups from coast to coast, and I participate in their mailing lists even though I’ve never physically attended their meetings.  So as you can tell, the question of who is and is not a member of a given user group can be difficult to answer.

Why Does It Matter?

For some groups, it doesn’t.  In smaller, less-formal user groups, there is very little overhead, and the question of group membership is rather trivial.  But in larger or more active groups, the issue of whom to include as an official group member can be more than just an academic metric.  Larger groups frequently have officers or board members to elect, money to collect and manage, events to run, rules to enforce, and perhaps even legal matters to handle.  To oversee all of that, there must be a clear way to identify which members of the community have a vote or are eligible for certain positions or responsibilities.

How Is It Done?

I’m sure it’s a little different for each group, but for the Dallas SQL Server user group we take a very open approach.  We consider anyone who is part of our mailing list to be a group member, with full voting rights.  We don’t have issues that require voting very often (normally just the yearly board elections) and have found this approach to be the most effective for us.  Further, we don’t arbitrarily restrict who may run for our local board, as the size of our group tends to make this a self-governing process – it’s highly unlikely that someone could walk in and get elected to the board without first having demonstrated some leadership of and loyalty to the group.

How Do YOU Do It?

So now that I’ve shared how we handle our SQL Server user group membership, I’d like to hear from other user group leaders to find out what others are doing.  Do you simply use your open mailing list, or do you have a more formal identification process?  Do you charge a nominal membership fee to set the members apart from everybody else?  Any pitfalls or successes you’d care to share in the approach you use?  Feel free to ping me offline if you’d rather share these details privately.

The PASS Acquisition of SQL Saturday

A couple of weeks ago, it was announced that the SQL Saturday franchise was voluntarily transferred to PASS.  This change of ownership could be a good thing for SQL Saturday, but I do have some questions and concerns.

Let me say for starters that the SQL Saturday franchise is near and dear to me, since it was one of the early events (SQL Saturday #3 in Tampa) that helped me get my start as a technical presenter, which has led to a lot of other opportunities since.  Since then I’ve attended and spoken at several other SQL Saturday events, I am part of the team putting together an event in Dallas this summer, and I’m planning to attend at least two more out-of-town SQL Saturday events this year.  As a result of my involvement, I’ve come to appreciate both the mission and the implementation of this framework.  I’m also friends with Andy Warren, Steve Jones, and Brian Knight, the founders of SQL Saturday, and I’ve spent a lot of time talking with Andy about these events, their value in the community, and how they could be sustained and even improved.  With all that said, it’s a fair assessment that I’m more than just a casual observer of all things related to SQL Saturday.

The Good

Because PASS is such a large organization with a larger pool of potential resources, the change of ownership could be a good thing for the future of the SQL Saturday brand.  With more volunteers to draw from, a full-time administrative staff, and of course the PASS name and rather large megaphone, the possibility exists to grow the already-successful franchise into a strong and ubiquitous series of localized events.  There is a good deal of content overlap between the functions of PASS and the SQL Saturday events, and aligning those goals into a consolidated effort has the potential to improve both entities.

The Risk

There is, however, risk in this change.  The key issue for me is the possibility that the management and implementation model will be changed.  SQL Saturday has already established a strong record of success in its short history through a ground-up, grassroots approach.  Andy specifically built this brand to be a framework and not a management hierarchy; as such, the local user groups were given an immense amount of latitude on the details of the implementation of these events.  There were very few rules that constrained the use of the SQL Saturday brand, and as a result, I think the local group leaders and volunteers felt a strong sense of ownership over the process.  If they own it and believe in it, they’re going to pour themselves into it.  The risk lies in the potential changes that PASS could make to integrate SQL Saturday into its existing infrastructure.  I’m hoping to find answers to the following questions:

  • Will PASS try to take a stronger role in running the local events, and if so, to what extent?  Specifically, is the local user group leadership in charge, or will the events be run by PASS?
  • Will there be a long checklist of boundaries and constraints on the details of the implementation? 
  • Can we still give first-time speakers the opportunity to speak, or will there be a qualification process that excludes those who have never given a technical presentation before? 
  • How are finances (sponsorship monies and event expenditures) handled?
  • Can local groups still raise sponsorship funds locally, or are we locked into those sponsors provided and approved by PASS?
  • How will the SQL Saturday mission be integrated into the current PASS initiatives, and will it be changed to accommodate same (or vice versa)?
  • Will the name be changed?  (A lesser concern, but a name change could adversely affect the already strong name recognition)

I know that all of these questions don’t have definite answers yet – after all, the ink is barely dry on the paperwork – so I’m willing to be patient for answers until the dust settles :)

Looking Forward

I hope I don’t sound pessimistic, because I’m not; to the contrary, I’m excited about what this could do for SQL Saturday.  If I can offer any advice to the decision makers, it would be this:  Don’t try to change the event too much.  Yes, brand it as a PASS event, and offer whatever resources (personnel, cash/merchandise, promotion and marketing) that can be spared for the event.  But we can all (hopefully) agree that SQL Saturday has been highly successful, especially when you consider the brief time it has been in existence.  I also know that Andy, Brian, and Steve have put in a lot of work on these events, and they wouldn’t give it away without some assurance that the brand is in good hands.

As things shake out, I’ll be sure to share any answers I get along with my analysis of same.

A New Season (A Networking Success Story)

I’d like to share a networking success story.  Last year, I blogged about my experience at the PASS Summit of 2005, where I was essentially a wallflower and didn’t really do any networking.  Since then, I’ve realized its importance and have embraced professional networking as a key component in a successful career. 

Ever since then I have carried through on the lesson I learned, spending as much time as possible getting to know my colleagues, and lending them a hand whenever possible.  At the PASS Summit this past November, I got the chance to redeem myself from the lack of initiative from my trip four years earlier, and took the opportunity to get to know as many people as I could.  During lunch on the third day of the Summit, I met a fellow Dallas-area business intelligence professional who works for a small consulting firm in my area.  He mentioned that his company was looking to hire one or two more senior BI people, and I hinted that I was considering making a move.

To make a long story short, that encounter led to a few phone calls and a series of meetings with this company, and as of next week, I will be a permanent part of their team!  My new role at Artis Consulting will be as a business intelligence consultant, solving complex business data problems alongside some very sharp coworkers.  I’ve had the opportunity to spend a little time with all of the leadership and several of the staff members, and I’m very excited about this move and the new challenges that it will bring.

So back to the success story… Looking back at the events of the past couple of months, I don’t believe things would have ended up this way without the groundwork I laid through networking.  In the last few years, I’ve spent a good deal of time working with and getting to know the folks in my local SQL Server user group, which in part led to my leadership role within that group.  That leadership position helped me to meet and develop friendships with other SQL Server group leaders, and one of those relationships led directly to a friendly introduction to my initial contact at Artis, resulting in the interviews and eventually the new career with that company.  It’s important to note that my new role at this company was not openly advertised as a vacant position, so I would likely not have found this opportunity through a traditional job search.  I do believe that there was a greater comfort level on both sides of the interviewing fence after we came together through a known and trusted common contact.

My recent experience is further proof that building professional relationships through networking is a great strategy for career improvement.  If you’re like I used to be – introverted, a bit shy, perhaps doubting the value of professional networking – I encourage you to take a chance and get to know some of your peers and colleagues.  Find a local user group in your area of expertise, and set a goal to meet X number of people.  Attend a local technology event such as a product launch or a SQL Saturday, and introduce yourself to others there.  Invite a colleague you don’t know to lunch or coffee.  Volunteer to be part of a team in events such as GiveCamp.  There’s nothing to lose!  The very worst thing that can happen is that you’ll meet some people you’ll never see again.  And often, things work out such that your networking contacts work together to change your career for the better.

“Fortune favors the bold.”  — Virgil

SSIS Alpha Splits using the CODEPOINT() Function

A relatively common requirement in ETL processing is to break records into separate outputs based on an alphabetical split on a range of letters.  A practical example of this would be a work queue for collections staff based on last name; records would be pulled from a common source and then separated into multiple outputs based on the Customer Last Name field, with the resulting output going to the person or group responsible for working that alphabetical subset of data.

There are a couple of different ways you can do this.  First is to use separate sources for each range of characters, and specify in your SELECT statement only those values that you want.  This is an effective quick-and-dirty option, but it doesn’t scale well as it requires multiple round trips to the database.  You could also accomplish this task using a simple text comparison for each letter of the alphabet, but this method is a typing-intensive operation.  For example, let’s say you want to group together the records for customers whose last names fall in the A-F range.  Using the Conditional Split transformation, your A-F output expression would look something like the following:
 

SUBSTRING(UPPER(LastName), 1, 1) == "A"
|| SUBSTRING(UPPER(LastName), 1, 1) == "B"
|| SUBSTRING(UPPER(LastName), 1, 1) == "C"
|| SUBSTRING(UPPER(LastName), 1, 1) == "D"
|| SUBSTRING(UPPER(LastName), 1, 1) == "E"
|| SUBSTRING(UPPER(LastName), 1, 1) == "F"

Your other groups would contain a similar statement to explicitly define each letter to be included in the group.  Not a complex operation, but one that requires a lot of typing.

 

An Easier Way

An easier way to do this is to use the relatively obscure CODEPOINT() function.  This function, which is part of the SSIS expression language, returns the Unicode code point (as a decimal value) of the leftmost character of the input string.  The above grouping would be rewritten as follows using the CODEPOINT() function:

CODEPOINT(UPPER(LastName)) >= 65
&& CODEPOINT(UPPER(LastName)) <= 70

The difference is, rather than enumerating each possible starting letter within the range, I’m now evaluating the Unicode value of the first character in my LastName text field, and only including those in the 65 to 70 range (A through F inclusive) in this output.  I’ve saved myself a little typing, and this approach is easier to maintain and troubleshoot in my opinion.  A sample conditional split with four groupings is shown below:

[Screenshot: Conditional Split transformation configured with four alphabetical groupings]
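
For readers who want to sanity-check the numeric boundaries before typing them into the Conditional Split editor, here is a minimal Python sketch.  It is not SSIS code.  It assumes that Python's built-in ord() yields the same decimal values as CODEPOINT() for basic Latin letters, and the function name and the four groupings (A-F, G-L, M-R, S-Z) are hypothetical choices for illustration, not necessarily the ones used in the package above.

```python
# Illustrative only: mirrors the CODEPOINT(UPPER(LastName)) range test in Python.
# The group names and boundaries are assumptions made for this sketch.
GROUPS = {
    "A-F": ("A", "F"),
    "G-L": ("G", "L"),
    "M-R": ("M", "R"),
    "S-Z": ("S", "Z"),
}


def alpha_group(last_name):
    """Return the name of the group containing the first letter, or None."""
    if not last_name:
        return None  # empty or missing names match no group
    code = ord(last_name.upper()[0])  # same idea as CODEPOINT(UPPER(LastName))
    for name, (low, high) in GROUPS.items():
        if ord(low) <= code <= ord(high):
            return name
    return None  # first character is not an uppercase A-Z letter


if __name__ == "__main__":
    print(ord("A"), ord("F"))  # boundary values used in the A-F expression
    print(alpha_group("Edwards"))
    print(alpha_group("mitchell"))
```

Names that begin with a digit or punctuation fall through every range here, which is a reminder that a real package would probably want a default output for unmatched rows.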

 

Take It Up A Notch

So you might ask, “That’s great, smart guy, but why go through this just to save myself maybe 5 minutes of typing?”  I’m glad you asked!  Let’s take our example a little bit further and assume we’re breaking these groupings down into smaller units.  Consider the possibility that, rather than grouping last names together based on the first letter of the last name, we’ve got a sufficient number of outputs that we’re now splitting the records within that first letter; for example, if we were to split the data stream where the last name starts with an M, we might slice our outputs on those starting with MA to MI, then MJ to MR, and finally MS to MZ.  By using the direct comparison method described above, our fully configured conditional split could have up to 26^2 possible two-letter combinations, which means we’ve got to write 676 comparisons (assuming all uppercase alpha characters) within the conditional split transformation.  That will likely impact package performance, not to mention the immense amount of typing required to set it up.  Fortunately, some creative use of the CODEPOINT() function can simplify this ETL requirement.

For this example, let’s assume that we need to separate our records within the letter M into three distinct groups as mentioned earlier, since statistically there are a lot of last names beginning with M.  For each “M” output, I’m first going to use a direct string comparison to verify that the first letter is an M (since we’re looking for a single match and not a range in the first character), and second, I’ll use CODEPOINT() in conjunction with the SUBSTRING() function to check that the second letter falls within the expected range for each output.

So for our first M grouping, the MA to MI group, the following expression would be used:

SUBSTRING(UPPER(LastName), 1, 1) == "M" 
&& (
CODEPOINT(SUBSTRING(UPPER(LastName), 2, 1)) >= 65
&&
CODEPOINT(SUBSTRING(UPPER(LastName), 2, 1)) <= 73)

The code above will match records where the first letter is a literal M, and the second character is between A (Unicode 65) and I (Unicode 73) inclusive.  Similarly, the MJ to MR expression reads as follows:
 

SUBSTRING(UPPER(LastName), 1, 1) == "M" 
&& (
CODEPOINT(SUBSTRING(UPPER(LastName), 2, 1)) >= 74
&&
CODEPOINT(SUBSTRING(UPPER(LastName), 2, 1)) <= 82)

And finally, the MS to MZ expression:

SUBSTRING(UPPER(LastName), 1, 1) == "M" 
&& (
CODEPOINT(SUBSTRING(UPPER(LastName), 2, 1)) >= 83
&&
CODEPOINT(SUBSTRING(UPPER(LastName), 2, 1)) <= 90)

 

The partially configured conditional split transformation would look similar to the following:

[Screenshot: partially configured Conditional Split transformation]

So you can see that you’ve still got a small chunk of code to write (or copy/paste and modify) for each of your outputs, but it’s far less trouble, and likely better performing, than enumerating all of the possible combinations of the first two letters of the LastName field.  The further you go into the string for your split (for example, breaking all the way down to split “McA” to “McF”, “McG” to “McN”, etc.), the more significant the efficiency gain from using this method over direct comparisons.
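
If even the copy/paste-and-modify step feels error prone, the expressions themselves can be generated.  The following is a rough Python sketch of that idea, not part of SSIS: the helper name second_letter_split is hypothetical, and it simply builds the same expression text shown above so it can be pasted into the Conditional Split editor.  It assumes the ranges are plain A-Z letters, for which ord() gives the values 65 through 90.

```python
def second_letter_split(column, first_letter, low, high):
    """Build an SSIS Conditional Split expression for one two-letter range.

    Hypothetical helper: the output is just text to paste into the editor.
    """
    first = first_letter.upper()
    lo, hi = ord(low.upper()), ord(high.upper())
    if lo > hi:
        raise ValueError("low must not sort after high")
    # Expression fragment that yields the code point of the second character.
    second = f"CODEPOINT(SUBSTRING(UPPER({column}), 2, 1))"
    return (
        f'SUBSTRING(UPPER({column}), 1, 1) == "{first}" '
        f"&& ({second} >= {lo} && {second} <= {hi})"
    )


if __name__ == "__main__":
    # The three M groupings from the example: MA-MI, MJ-MR, MS-MZ.
    for low, high in [("A", "I"), ("J", "R"), ("S", "Z")]:
        print(second_letter_split("LastName", "M", low, high))
```

Each generated string is a single-line version of the corresponding expression above; only the line breaks differ.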

One caveat that bears mentioning: You’ll notice that I’ve used the UPPER() function generously in these examples.  The reason for this is twofold: First, a direct string comparison in the SSIS expression language is case sensitive; for example, “M” does not equal “m”.  Second, the same holds true for the Unicode decimal values returned by CODEPOINT().  Uppercase M, or Unicode value 77, does not equal lowercase m, or Unicode value 109.  Use of the UPPER() function helps to ensure that we’re making accurate comparisons regardless of case.
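
The case issue is easy to see outside of SSIS as well.  This small Python sketch (again an illustration, with a hypothetical function name, and assuming ord() matches CODEPOINT() for these letters) shows a lowercase name missing the 65 to 90 range unless it is uppercased first.

```python
def in_upper_range(text, low=65, high=90, normalize=True):
    """Check whether the first character's code point falls in [low, high]."""
    if not text:
        return False
    # normalize=True plays the role of wrapping the column in UPPER().
    first = text.upper()[0] if normalize else text[0]
    return low <= ord(first) <= high


if __name__ == "__main__":
    print(ord("M"), ord("m"))  # uppercase and lowercase have different values
    print(in_upper_range("mcdonald", normalize=False))
    print(in_upper_range("mcdonald"))
```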


Conclusion

CODEPOINT() is a rarely used function in the SSIS expression language, but it can be an effective tool in your ETL arsenal in some cases.  For alphabetical grouping or splitting of records, it’s a very handy function that saves a lot of typing at design time.

More information about the CODEPOINT() function can be found at this page on MSDN.