tag:blogger.com,1999:blog-623696624102177636.post4083105053922056759..comments2023-09-29T07:02:35.675-07:00Comments on Anything but R-bitrary: Manipulating Data Frames Using sqldf - A Brief OverviewChris Nicholashttp://www.blogger.com/profile/09479130398118611479noreply@blogger.comBlogger8125tag:blogger.com,1999:blog-623696624102177636.post-85834294975352763622013-04-02T17:26:07.072-07:002013-04-02T17:26:07.072-07:00Hello Skan and thanks for the question.
I have ...Hello Skan and thanks for the question. <br /><br />I have crafted a simple example that creates a zoo object based on the results returned by a call to sqldf (arbitrarily filtering a generated data.frame). I then aggregate over the zoo object using a coarser grained representation of the time variable (similar to the example provided in the zoo documentation).<br /><br />#load required libraries<br />library(sqldf)<br />library(zoo)<br /><br />#create some data<br />data=data.frame(val1=rnorm(12, mean=20),t=as.Date("2003-02-01") + c(1, 3, 7, 9, 11, 14, 17, 20, 22, 25, 28, 30) - 1)<br /><br />#add a column to filter data with<br />data$val2=c(rep("a",6),rep("b",6))<br /><br />#add a column with a coarser grained date for use with aggregate<br />data$year_qtr=as.yearqtr(data$t)<br /><br />#arbitrarily query the data.frame using sqldf<br />filter.data=sqldf("select val1, val2, t, year_qtr from data where val2='b'")<br /><br />#create a zoo object<br />the.zoo=zoo(filter.data$val1,filter.data$t)<br /><br />#aggregate using zoo<br />aggregate(the.zoo,filter.data$year_qtr,mean)<br /><br />A value similar to this:<br /><br />2003 Q1 <br />20.42594 should be returned<br /><br />A call to sqldf returns a data.frame so you simply need to make sure you pass in the correct vector arguments to zoo in order to make it work. Hope that helps and good luck with those large files.Chris Nicholashttps://www.blogger.com/profile/09479130398118611479noreply@blogger.comtag:blogger.com,1999:blog-623696624102177636.post-31971469348705389412013-04-02T13:19:30.594-07:002013-04-02T13:19:30.594-07:00Thanks for the comment Raffael. Exponentiation is ...Thanks for the comment Raffael. Exponentiation is supported via the POWER function.<br /><br />Try this example using the built-in dataset warpbreaks:<br /><br />data(warpbreaks)<br />sqldf("select power(breaks,2) as 'breaks^2' from warpbreaks limit 6")<br /><br />sqldf will return a new column named breaks^2 containing the squared value of breaks.<br /><br />The functions supported by SQLite are <a href="http://www.devart.com/dotconnect/sqlite/docs/UDF_implemented.html" rel="nofollow">here</a>Chris Nicholashttps://www.blogger.com/profile/09479130398118611479noreply@blogger.comtag:blogger.com,1999:blog-623696624102177636.post-77630411054204951302013-03-30T17:35:31.548-07:002013-03-30T17:35:31.548-07:00I use sqldf a lot and I am amazed at how simple an...I use sqldf a lot and I am amazed at how simple and fast it operates on data frames. But I was very much surprised when I found that I couldn't use exponentiation.Anonymoushttps://www.blogger.com/profile/14471874439152798805noreply@blogger.comtag:blogger.com,1999:blog-623696624102177636.post-78807700889426600192013-03-26T18:29:00.083-07:002013-03-26T18:29:00.083-07:00Hello
How would you use it jointly with the xts o...Hello<br /><br />How would you use it jointly with the xts or zoo packages?<br />For example I want to apply the aggregate zoo command to two 10GB time series (I want to use zoo because they have different timing or are non-homogeneous and I cannot just glue them together).<br /><br />The problem is that I don't know how to use sqldf (or hadoop or ff...) as input for zoo.<br />Regardsskanhttps://www.blogger.com/profile/03631114761711061847noreply@blogger.comtag:blogger.com,1999:blog-623696624102177636.post-3047162312326916892012-08-30T09:22:54.798-07:002012-08-30T09:22:54.798-07:00I've uploaded the files so they can be downloa...I've uploaded the files so they can be downloaded - I embedded links within the post. Thanks!Josh Millshttps://www.blogger.com/profile/10878471065992438501noreply@blogger.comtag:blogger.com,1999:blog-623696624102177636.post-53831258825696150632012-08-24T08:17:32.790-07:002012-08-24T08:17:32.790-07:00Any chance you can share the datasets for reproduc...Any chance you can share the datasets for reproducibility's sake?BODhttps://www.blogger.com/profile/18132559299348207667noreply@blogger.comtag:blogger.com,1999:blog-623696624102177636.post-54503239585484610352012-08-08T20:23:15.514-07:002012-08-08T20:23:15.514-07:00The query shown that does not use sqldf could have...The query shown that does not use sqldf could have. sqldf automatically pulls in the RSQLite.extfuns library when using it with the default sqlite database so all that library's functions including lower_quantile, upper_quantile and median are available out of the box. See http://sqldf.googlecode.com/#Example_15._Use_of_RSQLite.extfuns_library_functionsGaborhttps://www.blogger.com/profile/11757402795312667103noreply@blogger.comtag:blogger.com,1999:blog-623696624102177636.post-40479084026375059372012-08-08T09:34:10.040-07:002012-08-08T09:34:10.040-07:00I LOVE how you've done it in knitr!I LOVE how you've done it in knitr!Tal Galilihttps://www.blogger.com/profile/10009278769907250225noreply@blogger.com