Using SQLite in R

Using SQLite in R

I received a ‘database-in-a-file’ from SQLite, a filename called [shell]uniprot_annotation.sqlite3[/shell]. It contains around 6 million connection between uniprot genes and some pathways and the idea of using it, is to improve the access speed. To import it to R, first the packakge [shell]RSQLite[/shell] needs to be installed:

[shell]
install.packages(“RSQLite”)
[/shell]

Then we start by making the connection to the database

[shell]
library(“RSQLite”)
con <- dbConnect(SQLite(), dbname="uniprot_annotation.sqlite3") [/shell] If the given filename does not exist, a new database is created, otherwise the existing database is opened. To show the tables inside this database connection, we can then type [shell] dbListTables(con) [/shell] Assuming that there is a data table called [shell]uniprot2ko[/shell] inside our database, one would copy the whole table to a R data frame by typing [shell] uni2ko <- dbGetQuery( con,'select * from uniprot2ko' ) [/shell] although this would be obviously a rather stupid thing to do, as we would loose then the speed we get in accessing the database with more defined queries. A more reasonable query would be e.g. to get all rows, where a certain field equals a certain term. Lets assume,w e want to get all uniprot IDs that are associated to a certain KEGG orthology (e.g. K00001), then we would type [shell] K0001 <- dbGetQuery( con,'select * from uniprot2ko where ko="K00001"' ) [/shell] Of course in the query command you can use now all types of different queries as known from SQL. Once you do not need the SQLite connection anymore, it can be closed by typing [shell] dbDisconnect(conn = con) [/shell] If the aim is to write an existing R data frame (e.g. called [shell]myData[/shell]) into a SQLite table, one would do the following. First establish a connection to a new file, that does not exist on the hard drive [shell] library("RSQLite") con <- dbConnect(SQLite(), dbname="myData.sqlite3") [/shell] At that stage you can see that the file was created on the HDD, but is just 0 bytes large. Next, we need to populate the database by writing the data frame into it [shell] dbWriteTable(con, "myDataDB", myData) [/shell] Here, [shell]myDataDB[/shell] is the table of the new created database table and [shell]myData[/shell] is the data frame that we want to write into it. From the HDD we can see then also that the file was populated and we can connect to it in the way, as described above. Back to: Introduction to R – Overview

Leave a Reply

%d bloggers like this: