Connecting to SQL Server on Azure with Python

written by Henrik Blidh on 2016-10-01

I recently had some dealings with an SQL Server instance on Azure, which I wanted connect to from a Linux (Ubuntu 14.04) computer. It took longer than I would have liked, and I thought it might be interesting to share the experience. All code can be gotten from this Gist.

There is a Microsoft tutorial which covers using the pymssql module directly, but I wanted to use SQLAlchemy with pymssql as database driver instead. Even more, I wanted to use the Records: SQL for Humans™ module written by Kenneth Reitz as well.

Setting up

So, first off I created an example database at Azure (see this tutorial for that) and populated it with the Microsoft AdventureWorks sample data. After that I set out to install and get pymssql to work.

It required the FreeTDS libraries for communicating with SQL Server, so those packages were installed first:

$ sudo apt-get install freetds-dev freetds-bin

After this a regular pip install worked fine, for all three modules I am interested in:

$ pip install pymssql sqlalchemy records

Connecting directly with pymssql

So the work to be done was to structure all database connection data in separate variables, find a query to run, set up the database connection and then execute it:

import pymssql

database_server_name = 'nedomkulltest'
mssql_host = '{0}.database.windows.net'.format(database_server_name)
mssql_port = 1433
mssql_user = 'dbadmin'
mssql_pwd = 'password'
mssql_db = 'exampledb'
mssql_driver = 'pymssql'

sql_query = """
    SELECT c.CustomerID,
           c.CompanyName,
           COUNT(soh.SalesOrderID) AS OrderCount
    FROM SalesLT.Customer AS c
    LEFT OUTER JOIN SalesLT.SalesOrderHeader AS soh ON c.CustomerID = soh.CustomerID
    GROUP BY c.CustomerID,
             c.CompanyName
    ORDER BY OrderCount DESC;
"""

conn = pymssql.connect(
    server=mssql_host,
    port=mssql_port,
    user='{0}@{1}'.format(mssql_user, database_server_name),
    password=mssql_pwd,
    database=mssql_db)

cursor = conn.cursor()
cursor.execute(sql_query)

while cursor.fetchone():
    print(row)

An exception is raised; not much of a surprise there:

Traceback (most recent call last):
  File "/home/hbldh/test_script.py", line 70, in <module>
    database=mssql_db)
  File "pymssql.pyx", line 641, in pymssql.connect (pymssql.c:10788)
pymssql.OperationalError: (40615, "Cannot open server 'nedomkulltest' requested by the login. Client with IP address '83.253.180.11' is not allowed to access the server.  To enable access, use the Windows Azure Management Portal or run sp_set_firewall_rule on the master database to create a firewall rule for this IP address or address range.  It may take up to five minutes for this change to take effect.DB-Lib error message 40615, severity 14:\nGeneral SQL Server error: Check messages from the SQL Server\nDB-Lib error message 20002, severity 9:\nAdaptive Server connection failed\n")

All right, adding my IP to the database server's whitelist and trying again. I actually got another error here the first time I tested, but I am unable to recreate it for this blog post. It was a problem of incompatible TDS versions and is described in pymssql's FAQ for communicating with the SQL Server 2012 that is on Azure. By doing a sudo nano /etc/freetds/freetds.conf and adding your server with TDS version 4.0, 7.2 or 8.0, this can be remedied.

After this brief prelude, the code above prints out a lot of rows:

(29485, u'Professional Sales and Service', 1)
(29531, u'Remarkable Bike Store', 1)
(29546, u'Bulk Discount Store', 1)
(29584, u'Futuristic Bikes', 1)
[...]
(696, u'Roadway Supplies', 0)
(697, u'Brakes and Gears', 0)
(698, u'Western Bike Supplies', 0)
(701, u'Future Bikes', 0)

Connecting with SQLAlchemy

It works! After reading up on SQLAlchemy's MSSQL Connection String format for pymssql and doing some trial-and-error coding, I managed to create a valid connection string and with that initiating a SQLAlchemy engine and making the same query as above:

connection_string = 'mssql+{0}://{1}:{2}@{3}:{4}/{5}'.format(
    mssql_driver,
    '{0}@{1}'.format(mssql_user, database_server_name),
    mssql_pwd,
    mssql_host,
    mssql_port,
    mssql_db)
print(connection_string)

from sqlalchemy import create_engine

engine = create_engine(connection_string)
results = engine.execute(sql_query)

for r in results:
    print(r)

Identical output, as expected, albeit with the connection string printed out as well:

mssql+pymssql://dbadmin@nedomkulltest:password@nedomkulltest.database.windows.net:1433/exampledb

Connecting with Records

Almost done. Finally, let's try Records as well:

import records

db = records.Database(connection_string)
rows = db.query(sql_query)

for r in rows:
    print(r)

Conclusion

With the two lines of code for creating a database connection and then making the query I have reached the level of brevity which I desired. It is to the SQLalchemy code (Records use it behind the scenes), but with the distinction that you don't create an engine but rather a more aptly named Database instance.

Records also uses the eminent Tablib module, which enables exporting the SQL data you just fetched into a multitude of other data representations with just one additional line. This last thing is something I find myself doing quite a lot, and it was that issue that inspired this blog post. Just by writing rows.dataset.json you get the query data converted to beautiful Python dicts and ready to be stored in e.g. a NoSQL database of your choice!

I wholeheartedly agree with Kenneth Reitz on the idea of SQL for Humans; making a few calls to a SQL database should be as simple and self-evident as in the Records case. Casting that data to CSV, JSON, Excel or some other representation should be done as easily.

Collected code

The embedded code below can be gotten from this Gist.