DB4S Plot

This post is the second in a two-part series on setting up and using SQLite at home (read Part 1 here). I was inspired to address this topic due to the number of posts on the Codecademy Forums regarding using SQLite locally on their own computers. Although I feel Codecademy is a great way to get started learning a new programming language, one of their weak points has always been transitioning their learners to coding offline. Hopefully these posts will serve as a definitive reference for getting beginners set up to explore and interact with SQLite databases on their own computers.

In my previous post, Part 1: CLI, I covered installing SQLite and setting it up for use on the command line. However, as I mentioned in that post, the command line is only one of the four ways I tend to interact with SQLite on my own computer — and it is the option I am least likely to go for if I need to take a close look at a dataset. In fact, I really only use the SQLite CLI if I am creating a database/table or if I am using it in a program or app. If my task involves data exploration to any degree, I find the CLI less efficient and frankly, less visually appealing. In this post I'll dive into the other three methods I use to interact with SQLite locally and why I believe they are more intuitive to use, easier on the eyes and provide better workflow integration than the command line.

To skip directly to a specific section click the link below:

  1. Command Line (previous article)
  2. Python/Jupyter Notebooks
  3. Visual Studio Code
  4. DB Browser

2. Python/Pandas/Jupyter Notebooks: The data science workflow

SQLite in Jupyter

Jupyter Notebooks are practically synonymous with data science, and for good reason — they make it simple to run small chunks of code in any order and see the results in-line. This makes Jupyter Notebooks perfect for exploratory data analysis in Python, but many people don't realize Notebooks provide an excellent interface for querying a SQL database as well. The key reason is Jupyter's default HTML rendering of Pandas DataFrames. By returning your SQL query result as a DataFrame, you get all the information in a clean, readable format without ever having to leave your Python environment.

This is my preferred method of exploring a SQLite database when I need to use the data as part of a larger Python data analysis project. It fits seamlessly into the rest of my project's workflow, and allows me to concentrate on the task at hand instead of having to jump back and forth between multiple windows.

Using SQLite with Python/Jupyter Notebooks

If you don't have Jupyter installed yet, download Anaconda or follow the installation instructions on the Jupyter website. I highly recommend installing via Anaconda, as it comes with jupyter, sqlite3 and pandas by default.

Once you have a new Jupyter Notebook open, you can start using SQLite with Python by importing sqlite3 and pandas in the first cell of your notebook.

In [1]:
import sqlite3
import pandas as pd

Technically, you can use SQLite in your notebook without Pandas, thanks to Python's standard sqlite3 module. If you wanted to do so, you might start with the code below, which does the following:

  • Creates a connection to the database (as in my last article, I am using the database found here for demonstration purposes)
  • Creates a cursor;
  • Executes a query to find the tables in our database; and
  • Prints the names of those tables
In [2]:
conn = sqlite3.connect('chinook.db')

c = conn.cursor()

# Note: I prefer using multi-line strings for query readability; compare with the string below: 
# 'SELECT name FROM sqlite_master WHERE type in ("table","view") AND name NOT LIKE "sqlite_%" ORDER BY 1;'
c.execute('''SELECT name 
             FROM sqlite_master 
             WHERE type IN ("table","view") AND name NOT LIKE "sqlite_%"
             ORDER BY 1;''')

print(c.fetchall())
[('albums',), ('artists',), ('customers',), ('employees',), ('genres',), ('invoice_items',), ('invoices',), ('media_types',), ('playlist_track',), ('playlists',), ('tracks',)]

In the printed output we are able to see all the table names, but the result isn't very visually appealing. While you might be able to overlook it for this query, what if you want to check out the schema for all of your tables?

The cell below shows the normal Python output of a query for the schema of all the tables in our database. As you can see, it is not only ugly, it is fairly difficult to read.

In [3]:
c.execute('''SELECT sql 
             FROM sqlite_master
             ORDER BY tbl_name, type DESC, name;''')
print(c.fetchall())
[('CREATE TABLE "albums"\r\n(\r\n    [AlbumId] INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,\r\n    [Title] NVARCHAR(160)  NOT NULL,\r\n    [ArtistId] INTEGER  NOT NULL,\r\n    FOREIGN KEY ([ArtistId]) REFERENCES "artists" ([ArtistId]) \r\n\t\tON DELETE NO ACTION ON UPDATE NO ACTION\r\n)',), ('CREATE INDEX [IFK_AlbumArtistId] ON "albums" ([ArtistId])',), ('CREATE TABLE "artists"\r\n(\r\n    [ArtistId] INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,\r\n    [Name] NVARCHAR(120)\r\n)',), ('CREATE TABLE "customers"\r\n(\r\n    [CustomerId] INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,\r\n    [FirstName] NVARCHAR(40)  NOT NULL,\r\n    [LastName] NVARCHAR(20)  NOT NULL,\r\n    [Company] NVARCHAR(80),\r\n    [Address] NVARCHAR(70),\r\n    [City] NVARCHAR(40),\r\n    [State] NVARCHAR(40),\r\n    [Country] NVARCHAR(40),\r\n    [PostalCode] NVARCHAR(10),\r\n    [Phone] NVARCHAR(24),\r\n    [Fax] NVARCHAR(24),\r\n    [Email] NVARCHAR(60)  NOT NULL,\r\n    [SupportRepId] INTEGER,\r\n    FOREIGN KEY ([SupportRepId]) REFERENCES "employees" ([EmployeeId]) \r\n\t\tON DELETE NO ACTION ON UPDATE NO ACTION\r\n)',), ('CREATE INDEX [IFK_CustomerSupportRepId] ON "customers" ([SupportRepId])',), ('CREATE TABLE "employees"\r\n(\r\n    [EmployeeId] INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,\r\n    [LastName] NVARCHAR(20)  NOT NULL,\r\n    [FirstName] NVARCHAR(20)  NOT NULL,\r\n    [Title] NVARCHAR(30),\r\n    [ReportsTo] INTEGER,\r\n    [BirthDate] DATETIME,\r\n    [HireDate] DATETIME,\r\n    [Address] NVARCHAR(70),\r\n    [City] NVARCHAR(40),\r\n    [State] NVARCHAR(40),\r\n    [Country] NVARCHAR(40),\r\n    [PostalCode] NVARCHAR(10),\r\n    [Phone] NVARCHAR(24),\r\n    [Fax] NVARCHAR(24),\r\n    [Email] NVARCHAR(60),\r\n    FOREIGN KEY ([ReportsTo]) REFERENCES "employees" ([EmployeeId]) \r\n\t\tON DELETE NO ACTION ON UPDATE NO ACTION\r\n)',), ('CREATE INDEX [IFK_EmployeeReportsTo] ON "employees" ([ReportsTo])',), ('CREATE TABLE "genres"\r\n(\r\n    [GenreId] INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,\r\n    [Name] NVARCHAR(120)\r\n)',), ('CREATE TABLE "invoice_items"\r\n(\r\n    [InvoiceLineId] INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,\r\n    [InvoiceId] INTEGER  NOT NULL,\r\n    [TrackId] INTEGER  NOT NULL,\r\n    [UnitPrice] NUMERIC(10,2)  NOT NULL,\r\n    [Quantity] INTEGER  NOT NULL,\r\n    FOREIGN KEY ([InvoiceId]) REFERENCES "invoices" ([InvoiceId]) \r\n\t\tON DELETE NO ACTION ON UPDATE NO ACTION,\r\n    FOREIGN KEY ([TrackId]) REFERENCES "tracks" ([TrackId]) \r\n\t\tON DELETE NO ACTION ON UPDATE NO ACTION\r\n)',), ('CREATE INDEX [IFK_InvoiceLineInvoiceId] ON "invoice_items" ([InvoiceId])',), ('CREATE INDEX [IFK_InvoiceLineTrackId] ON "invoice_items" ([TrackId])',), ('CREATE TABLE "invoices"\r\n(\r\n    [InvoiceId] INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,\r\n    [CustomerId] INTEGER  NOT NULL,\r\n    [InvoiceDate] DATETIME  NOT NULL,\r\n    [BillingAddress] NVARCHAR(70),\r\n    [BillingCity] NVARCHAR(40),\r\n    [BillingState] NVARCHAR(40),\r\n    [BillingCountry] NVARCHAR(40),\r\n    [BillingPostalCode] NVARCHAR(10),\r\n    [Total] NUMERIC(10,2)  NOT NULL,\r\n    FOREIGN KEY ([CustomerId]) REFERENCES "customers" ([CustomerId]) \r\n\t\tON DELETE NO ACTION ON UPDATE NO ACTION\r\n)',), ('CREATE INDEX [IFK_InvoiceCustomerId] ON "invoices" ([CustomerId])',), ('CREATE TABLE "media_types"\r\n(\r\n    [MediaTypeId] INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,\r\n    [Name] NVARCHAR(120)\r\n)',), ('CREATE TABLE "playlist_track"\r\n(\r\n    [PlaylistId] INTEGER  NOT NULL,\r\n    [TrackId] INTEGER  NOT NULL,\r\n    CONSTRAINT [PK_PlaylistTrack] PRIMARY KEY  ([PlaylistId], [TrackId]),\r\n    FOREIGN KEY ([PlaylistId]) REFERENCES "playlists" ([PlaylistId]) \r\n\t\tON DELETE NO ACTION ON UPDATE NO ACTION,\r\n    FOREIGN KEY ([TrackId]) REFERENCES "tracks" ([TrackId]) \r\n\t\tON DELETE NO ACTION ON UPDATE NO ACTION\r\n)',), ('CREATE INDEX [IFK_PlaylistTrackTrackId] ON "playlist_track" ([TrackId])',), (None,), ('CREATE TABLE "playlists"\r\n(\r\n    [PlaylistId] INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,\r\n    [Name] NVARCHAR(120)\r\n)',), ('CREATE TABLE sqlite_sequence(name,seq)',), ('CREATE TABLE sqlite_stat1(tbl,idx,stat)',), ('CREATE TABLE "tracks"\r\n(\r\n    [TrackId] INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,\r\n    [Name] NVARCHAR(200)  NOT NULL,\r\n    [AlbumId] INTEGER,\r\n    [MediaTypeId] INTEGER  NOT NULL,\r\n    [GenreId] INTEGER,\r\n    [Composer] NVARCHAR(220),\r\n    [Milliseconds] INTEGER  NOT NULL,\r\n    [Bytes] INTEGER,\r\n    [UnitPrice] NUMERIC(10,2)  NOT NULL,\r\n    FOREIGN KEY ([AlbumId]) REFERENCES "albums" ([AlbumId]) \r\n\t\tON DELETE NO ACTION ON UPDATE NO ACTION,\r\n    FOREIGN KEY ([GenreId]) REFERENCES "genres" ([GenreId]) \r\n\t\tON DELETE NO ACTION ON UPDATE NO ACTION,\r\n    FOREIGN KEY ([MediaTypeId]) REFERENCES "media_types" ([MediaTypeId]) \r\n\t\tON DELETE NO ACTION ON UPDATE NO ACTION\r\n)',), ('CREATE INDEX [IFK_TrackAlbumId] ON "tracks" ([AlbumId])',), ('CREATE INDEX [IFK_TrackGenreId] ON "tracks" ([GenreId])',), ('CREATE INDEX [IFK_TrackMediaTypeId] ON "tracks" ([MediaTypeId])',)]

Of course, we can always use PRAGMA functions to check out the table info. This option (using the code below) is significantly easier to read, but it's still no better than using the CLI.

In [4]:
meta = c.execute("PRAGMA table_info('albums')")
for r in meta:
    print(r)
(0, 'AlbumId', 'INTEGER', 1, None, 1)
(1, 'Title', 'NVARCHAR(160)', 1, None, 0)
(2, 'ArtistId', 'INTEGER', 1, None, 0)

This is where Pandas comes in to save the day. Because Pandas comes with a .read_sql() method, we can quickly whip up a function that runs a SQL query on our database and returns the result as a Pandas Dataframe.

In [5]:
# conn = sqlite3.connect('chinook.db')
def run_query(query):
    try:                             
        df = pd.read_sql(query, conn)  
    except Exception as e:
        print(e.message)
    return df

If we run our table name query again using Pandas, we have a slick-looking table with all the table names in a single column rather than in a list of tuples.

In [6]:
table_names = run_query('''SELECT name
                           FROM sqlite_master 
                           WHERE type IN ("table","view") AND name NOT LIKE "sqlite_%"
                           ORDER BY 1;''')
table_names
Out[6]:
name
0 albums
1 artists
2 customers
3 employees
4 genres
5 invoice_items
6 invoices
7 media_types
8 playlist_track
9 playlists
10 tracks

What about our query for all the table schemas? Unfortunately this weakness still plagues us, even when enlisting the help of Pandas.

In [7]:
run_query('''SELECT sql 
             FROM sqlite_master
             ORDER BY tbl_name, type DESC, name;''')
Out[7]:
sql
0 CREATE TABLE "albums"\r\n(\r\n [AlbumId] IN...
1 CREATE INDEX [IFK_AlbumArtistId] ON "albums" (...
2 CREATE TABLE "artists"\r\n(\r\n [ArtistId] ...
3 CREATE TABLE "customers"\r\n(\r\n [Customer...
4 CREATE INDEX [IFK_CustomerSupportRepId] ON "cu...
5 CREATE TABLE "employees"\r\n(\r\n [Employee...
6 CREATE INDEX [IFK_EmployeeReportsTo] ON "emplo...
7 CREATE TABLE "genres"\r\n(\r\n [GenreId] IN...
8 CREATE TABLE "invoice_items"\r\n(\r\n [Invo...
9 CREATE INDEX [IFK_InvoiceLineInvoiceId] ON "in...
10 CREATE INDEX [IFK_InvoiceLineTrackId] ON "invo...
11 CREATE TABLE "invoices"\r\n(\r\n [InvoiceId...
12 CREATE INDEX [IFK_InvoiceCustomerId] ON "invoi...
13 CREATE TABLE "media_types"\r\n(\r\n [MediaT...
14 CREATE TABLE "playlist_track"\r\n(\r\n [Pla...
15 CREATE INDEX [IFK_PlaylistTrackTrackId] ON "pl...
16 None
17 CREATE TABLE "playlists"\r\n(\r\n [Playlist...
18 CREATE TABLE sqlite_sequence(name,seq)
19 CREATE TABLE sqlite_stat1(tbl,idx,stat)
20 CREATE TABLE "tracks"\r\n(\r\n [TrackId] IN...
21 CREATE INDEX [IFK_TrackAlbumId] ON "tracks" ([...
22 CREATE INDEX [IFK_TrackGenreId] ON "tracks" ([...
23 CREATE INDEX [IFK_TrackMediaTypeId] ON "tracks...

What we can do, however, is quickly create a dictionary containing DataFrames of each table's schema — which is the next best thing.

In [8]:
schema = {}
for table in table_names['name']:
    schema[table] = run_query('PRAGMA table_info({})'.format(table))

Once our schema dictionary is created, we can examine the table info for any table in our database with ease:

In [9]:
schema['tracks']
Out[9]:
cid name type notnull dflt_value pk
0 0 TrackId INTEGER 1 None 1
1 1 Name NVARCHAR(200) 1 None 0
2 2 AlbumId INTEGER 0 None 0
3 3 MediaTypeId INTEGER 1 None 0
4 4 GenreId INTEGER 0 None 0
5 5 Composer NVARCHAR(220) 0 None 0
6 6 Milliseconds INTEGER 1 None 0
7 7 Bytes INTEGER 0 None 0
8 8 UnitPrice NUMERIC(10,2) 1 None 0

One aspect I really enjoy about interacting with SQLite in this fashion is the ability to assign both queries and their results to variables for later use. By leveraging this, my code feels like Python from beginning to end rather than a disjointed Python/SQL hybrid.

For example, let's say I want to create a DataFrame of all the tracks whose names begin with the letter "a". I can save my query as a string, a_tracks_query, and also save the resulting DataFrame as its own variable, a_tracks.

In [10]:
a_tracks_query = '''SELECT *
                    FROM tracks
                    WHERE Name LIKE "a%"
                    ORDER BY Name;'''

a_tracks = run_query(a_tracks_query)
a_tracks
Out[10]:
TrackId Name AlbumId MediaTypeId GenreId Composer Milliseconds Bytes UnitPrice
0 236 A Banda 23 1 7 None 132493 4349539 0.99
1 3118 A Bencao E Outros 247 1 7 None 421093 14234427 0.99
2 3209 A Benihana Christmas, Pts. 1 & 2 251 3 22 None 2519436 515301752 1.99
3 873 A Carta 70 1 7 Djavan - Gabriel, O Pensador 347297 11493463 0.99
4 793 A Castle Full Of Rascals 63 1 1 Ian Gillan, Roger Glover, Jon Lord, Steve Mors... 311693 10159566 0.99
5 298 A Cor Do Sol 26 1 8 Bernardo Vilhena/Da Gama/Lazão 231392 7663348 0.99
6 311 A Cor Do Sol 27 1 8 Bernardo Vilhena/Da Gama/Lazao 273031 9142937 0.99
7 1731 A Cura 142 1 7 None 280920 9260588 0.99
8 2833 A Day In the Life 227 3 18 None 2620245 462818231 1.99
9 2129 A Different Kind Of Blue 176 1 10 Brian Eno, Bono, Adam Clayton, The Edge & Larr... 120816 3884133 0.99
10 533 A E O Z 42 1 4 Mutantes 518556 16873005 0.99
11 290 A Estrada 26 1 8 Bino Farias/Da Gama/Lazão/Toni Garrido 248842 8275673 0.99
12 302 A Estrada 27 1 8 Da Gama/Lazao/Toni Garrido 282174 9344477 0.99
13 2771 A Festa Do Santo Reis 223 1 7 None 159791 5204995 0.99
14 419 A Kind Of Magic 36 1 1 Roger Taylor 262608 8689618 0.99
15 220 A Luz De Tieta 21 1 7 Caetano Veloso 251742 8507446 0.99
16 2970 A Man And A Woman 235 1 1 Adam Clayton, Bono, Larry Mullen & The Edge 270132 8938285 0.99
17 2825 A Measure of Salvation 227 3 18 None 2563938 489715554 1.99
18 2792 A Melhor Forma 224 1 4 Titãs 191503 6349938 0.99
19 1767 A Menina Dança 145 1 7 Caetano Veloso e Gilberto Gil 129410 4326918 0.99
20 3481 A Midsummer Night's Dream, Op.61 Incidental Mu... 326 2 24 None 387826 6497867 0.99
21 1967 A Moça e a Chuva 161 1 16 Mônica Marianno 274625 8929357 0.99
22 2438 A New Flame 141 1 1 Mick Hucknall 237662 7822875 0.99
23 1917 A Noite Do Meu Bem 158 1 7 Dolores Duran 220081 7125225 0.99
24 1105 A Novidade 73 1 7 Gilberto Gil 324780 10765600 0.99
25 1099 A Novidade (Live) 86 1 7 None 316969 10508000 0.99
26 377 A Paz 33 1 7 Donato/Gilberto Gil 263183 8619173 0.99
27 1111 A Paz 73 1 7 Gilberto Gil 293093 9593064 0.99
28 2962 A Room At The Heartbreak Hotel 234 1 1 U2 274546 9015416 0.99
29 596 A Rua 47 1 7 None 238027 7930264 0.99
... ... ... ... ... ... ... ... ... ...
169 887 As Aparências Enganam 71 1 7 None 247379 8014346 0.99
170 1521 As Dores do Mundo 123 1 7 Hyldon 255477 8537092 0.99
171 1084 As Pegadas Do Amor 85 1 10 Gilberto Gil 209136 6899062 0.99
172 2348 As Profecias 192 1 1 None 232515 7657732 0.99
173 671 As Rosas Não Falam (Beth Carvalho) 53 1 7 None 116767 3836641 0.99
174 2668 As Tears Go By 216 1 1 Jagger/Richards/Oldham 164284 5357350 0.99
175 2424 As The Years Go by 197 1 1 Albert King 233064 7566829 0.99
176 464 As We Sleep 38 1 2 None 316865 10429398 0.99
177 1078 Asa Branca 85 1 10 Humberto Teixeira/Luiz Gonzaga 217051 7387183 0.99
178 984 Asas 78 1 7 None 231915 7711669 0.99
179 2616 Ashes And Ghosts 212 1 1 Billy Duffy/Bob Rock/Ian Astbury 300591 9787692 0.99
180 929 Ashes To Ashes 74 1 4 Bill Gould/Jon Hudson/Mike Bordin/Mike Patton/... 217391 7093746 0.99
181 1741 Assaltaram A Gramática 143 1 7 None 261041 8698959 0.99
182 1717 Assim Caminha A Humanidade 142 1 7 None 210755 6993763 0.99
183 1080 Assum Preto 85 1 10 Humberto Teixeira/Luiz Gonzaga 199653 6625000 0.99
184 415 Astronomy 35 1 3 A.Bouchard/J.Bouchard/S.Pearlman 397531 13065612 0.99
185 3087 Atomic Punk 244 1 1 Edward Van Halen, Alex Van Halen, Michael Anth... 182073 5908861 0.99
186 231 Atras Da Porta 23 1 7 None 189675 6132843 0.99
187 221 Atrás Da Verd-E-Rosa Só Não Vai Quem Já Morreu 21 1 7 David Corrêa - Paulinho Carvalho - Carlos Sena... 307252 10364247 0.99
188 1733 Atrás Do Trio Elétrico 142 1 7 None 149080 4917615 0.99
189 879 Atrás da Porta 71 1 7 None 166608 5432518 0.99
190 1872 Attitude 153 1 3 Hetfield, Ulrich 315898 10335734 0.99
191 2755 Até Que Enfim Encontrei Você 222 1 7 Tim Maia 105064 3477751 0.99
192 2500 Ava Adore 202 1 4 Billy Corgan 261433 8590412 0.99
193 3416 Ave Maria 285 2 24 Franz Schubert 338243 5605648 0.99
194 982 Avisa 78 1 7 None 355030 11844320 0.99
195 1721 Aviso Aos Navegantes 143 1 7 None 242808 8058651 0.99
196 871 Azul 70 1 7 Djavan 253962 8381029 0.99
197 2753 Azul Da Cor Do Mar 222 1 7 Tim Maia 197955 6475007 0.99
198 867 Açai 70 1 7 Djavan 270968 8893682 0.99

199 rows × 9 columns

Later on, I may change my mind and want a_tracks to only contain the names:

In [11]:
a_tracks = a_tracks[['Name']]
a_tracks.head(10)
Out[11]:
Name
0 A Banda
1 A Bencao E Outros
2 A Benihana Christmas, Pts. 1 & 2
3 A Carta
4 A Castle Full Of Rascals
5 A Cor Do Sol
6 A Cor Do Sol
7 A Cura
8 A Day In the Life
9 A Different Kind Of Blue

After altering a_tracks, I realize that I still need the rest of the info for those tracks for later analysis. Rather than going back and changing my code or rewriting the query I can simply call my run_query function, passing in a_tracks_query as my parameter, and assign it to a new variable.

In [12]:
a_tracks_info = run_query(a_tracks_query)
a_tracks_info.head(10)
Out[12]:
TrackId Name AlbumId MediaTypeId GenreId Composer Milliseconds Bytes UnitPrice
0 236 A Banda 23 1 7 None 132493 4349539 0.99
1 3118 A Bencao E Outros 247 1 7 None 421093 14234427 0.99
2 3209 A Benihana Christmas, Pts. 1 & 2 251 3 22 None 2519436 515301752 1.99
3 873 A Carta 70 1 7 Djavan - Gabriel, O Pensador 347297 11493463 0.99
4 793 A Castle Full Of Rascals 63 1 1 Ian Gillan, Roger Glover, Jon Lord, Steve Mors... 311693 10159566 0.99
5 298 A Cor Do Sol 26 1 8 Bernardo Vilhena/Da Gama/Lazão 231392 7663348 0.99
6 311 A Cor Do Sol 27 1 8 Bernardo Vilhena/Da Gama/Lazao 273031 9142937 0.99
7 1731 A Cura 142 1 7 None 280920 9260588 0.99
8 2833 A Day In the Life 227 3 18 None 2620245 462818231 1.99
9 2129 A Different Kind Of Blue 176 1 10 Brian Eno, Bono, Adam Clayton, The Edge & Larr... 120816 3884133 0.99

Although this is a very basic example, it is easy to see how querying a SQLite database in this way can add flexibility and cohesion to your Python project.

For a more complete illustration feel free to check out this visualization notebook I made back when I was working through Codecademy's Data Visualization course. In it, I chose to combine the SQL and visualization portions of the project into a single Jupyter Notebook using the exact techniques discussed here. The database used for that project contained sample data from the Twitch data science team, consisting of roughly 60,000 rows of data.

3. Visual Studio Code: The perfect middle ground?

SQLite in VS Code

Despite the usefulness of Jupyter for exploring SQLite databases, more often than not my projects don't require both Python and SQL. As such, my go-to method for exploring SQLite databases is actually through none other than Visual Studio Code.

I already use VS Code as my primary code editor, in large part because of the sheer number of quality extensions that are available for it. One of the extensions that I use regularly is the SQLite extension by alexcvzz, which provides what I find to be an excellent and intuitive SQLite interface.

Another benefit of using VS Code is that your SQL queries and output will match your chosen color theme. In this regard, VS Code provides the most customizable experience that I've found for local SQLite usage.

SQLite VS Code Color GIF

Using SQLite with VS Code

To get started, first make sure VS Code is installed on your computer. Once it is, open it up and go to the Extensions sidebar. On Windows, the keyboard shortcut for this is Ctrl + Shift + X.

In the extensions search field, type sqlite and you will see the results containing VS Code extensions for SQLite. It should look something like this:

SQLite Extension

From here, simply click Install on the extension called SQLite by axexcvzz. You can always add a different extension later, but SQLite is more than enough for my needs and is very easy to use.

With the SQLite extension installed, you can start using it by opening the Command Palette (View > Command Palette or Ctrl + Shift + P on Windows) and typing sqlite. From here you can see all of the available SQLite actions:

SQLite Extension Commands

Let's say you want to open up the chinook.db database and query it. To do this, click on Open Database in the Command Palette and select the database file. This will add a new collapsible section to your Explorer sidebar called SQLite Explorer, where you can click through to see the schema of your database and tables. If you want to look through a table's contents, just click on the play button icon to the right of the table name to view the entire table. This is the equivalent of querying SELECT * FROM table_name;:

SQLite Extension Explorer

If you want to type your own queries, there are a couple of ways you can go about it. One way is to click the New Query button to the right of the database name. This will open up a new file where you can type as many queries as you like, and VS Code will automatically know that you are querying that database. In this example, we just have the one database open: chinook.db.

To run all queries in the file, open the Command Palette and select SQLite: Run Query (Ctrl + Shift + Q on Windows). Alternatively, if you only want to run a particular query from your file, you can highlight it and select SQLite: Run Selected Query from the Command Palette.

SQLite Extension New Query

You can also create a new .sql file, or open an existing one, and query from that file. However, if you do this VS Code will not automatically assume that you are trying to query from the database you have opened in the SQLite Explorer sidebar. When you go to run your query for the first time, you will be prompted to choose either the in-memory database or to choose a database file:



Once you have selected the database file (in this example chinook.db), queries are run exactly the same as if you were using the file generated by the New Query button.

SQLite Extension Query From File

As you can see, VS Code can be a powerful and flexible tool for exploring SQLite databases once it is set up correctly. Although I didn't cover them here, there are even more features of the SQLite extension, including exporting your query results as HTML, JSON or CSV. I highly encourage anyone who is interested to give it a try and see whether it fits into your arsenal of SQL exploration tools.

4. DB Browser for SQLite: The powerful and beginner-freindly GUI

DB4S Homepage

Finally, we come to DB Browser for SQLite. Of the four methods I've discussed thus far, this is the one with which I have the least experience. Until recently I didn't even consider using a GUI app for SQLite. However, after hearing a lot of good things about it I finally downloaded DB Browser for SQLite (DB4S) and took it for a spin. Much to my surprise, I discovered a tool that was both powerful and easy to use. In fact, I would go so far as to recommend it to both beginners and seasoned professionals due to the fact that the user can make it as easy or as complicated as they want. Although I don't see myself needing it for any SQLite tasks in the near future, I look forward to incorporating it into my workflow more often — particularly when I want to take a quick look through a larger database with multiple tables.

Using DB4S

Installing DB4S is easy. Simply head on over to the download page and follow the instructions for your operating system. If you are a Dark Mode fiend like me, consider downloading the alpha release for version 3.12.0. As of this writing, it is the only version that supports a working dark theme for Windows and Mac OS (Dark Mode has been supported on Linux since version 3.11.1). If you decide to go with a version that supports Dark Mode, this can be selected in Edit > Preferences > Application style. You can also customize the font, color and size of your SQL queries under the SQL tab in Preferences, though I find the defaults look nice in both the dark and default modes.

DB4S Customization

With DB4S installed, opening a database file is as simple as clicking the Open Database button near the top of the screen and selecting the file you want to open. One of the greatest things about DB4S is that although you don't need to be familiar with SQL commands to use it, it will show you the SQL code for every command the program uses to display data to the user.

For example, just by opening chinook.db, we are presented with an interactive Database Structure menu, along with the SQL Log on the right hand side. The SQL Log shows the PRAGMA functions and SELECT statement the app called on our database to display this info. Thus, beginners get the convenience using the expandable menu, while also being exposed to the commands used under the hood.

DB4S Database Structure and Log

Moving over to the Browse Data tab, we see that a few more commands get added to the SQL Log each time a table is selected from the dropdown menu. If you aren't super comfortable writing SQL queries by hand, the Database Structure and Browse Data tabs are likely where you are going to be spending most of your time. However, even for those of us who have plenty of SQL practice under our belts, the Browse Data view includes several cool features for quickly filtering the data.

For instance, clicking on any of a table's column names instantly sorts the data by that column, in ascending order (click one more time for descending order). Combine this with the Filter field under each column name and you can blaze through what might otherwise turn into complicated queries. In the screenshot below I achieved the equivalent of this query...

SELECT * 
FROM tracks
WHERE 
  Name LIKE 'M%' AND
  Composer LIKE 'C%' AND
  Milliseconds > 100000 AND
  GenreId LIKE '%7%'
ORDER BY Name ASC;
...with a couple of clicks and 12 keystrokes.

DB4S Browse Data Filters

Point-and-click features aside, DB Browser for SQLite makes it easy to query your database manually, either by typing your queries on the fly or by opening a SQL file where you've already prepared and saved your queries. To utilize either of these options, merely navigate over to the Execute SQL tab.

Since I already had a bunch of queries saved in a file called sample.sql, I can open that file and hit the Execute button (or Ctrl + Enter, F5 or Ctrl + R on Windows). This will execute all queries in the file or whichever query I have highlighted. However, it is important to note that the output section of the app will only display the results for the most recent query that you have run.

DB4S Execute SQL

As mentioned earlier, DB4S can become as complex as you want it to, and there are dozens of features that can't be covered in this post but are worth checking out. For example, the Plot feature (seen in the image at the top of the post) might really interest some readers. Others might be more intrigued by the ability to edit Pragmas at the touch of a button. Either way, so long as its developers stay true to the simple-yet-powerful design that it currently embodies, I'll continue to recommend DB Browser for SQLite as an excellent addition to any SQLite toolkit.