I have a long-standing love affair with large data sets. It really feels cool to have a database of ALL the birds in the world, a network graph of all the connecting flights in the world, or detailed data on all the countries in the world.
When I was nine or ten years old, all I wanted for Christmas was Statistisk Årsbok.
I got it. I loved it.
My bookshelves contain a plethora of non-fiction titles chosen to cover every possible subject. I have books on medieval weapons, steam engines, gardening and tropical diseases.
I don't know why I am so fascinated by the thought of collecting all the world's data, but I do suspect that the urge to collect is one of the reasons. Another reason is probably the pursuit of knowledge. A third is the sense of having control over the world. The rest probably comes from my childhood; they say that most stuff originates there. Mine was good.
Even better than data sets are data sets with a position. Maps. They say that 95% of all information can be positioned. I don't know if that is true, but I used it as a fact when I was teaching Geographic Information Systems (GIS) at the university.
GIS can be a troublesome experience for a developer. The main players are huge systems that are not always easy to develop in. If you have ever tried coding in AML or MapBasic, you know what I mean.
Maps, however simple they may seem, are complicated beasts, and in a GIS you add data layer by layer, usually from different sources. Land data from one place, vegetation from another, roads from a third and so on. Each source can use a different cartographic projection and level of detail, meaning that your aggregated map can very well end up with roads in the ocean and country borders that overlap. I really hate cartographic projections, by the way. They are an old technique from the days when you had to map the round world onto flat paper. Let the world be round in a computer! Projections just make things more complex, and I want my life simple. (I guess that is one reason why I love Google Earth.)
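To make the roads-in-the-ocean problem a bit more concrete, here is a minimal Python sketch of what a projection does to a coordinate. It assumes a spherical Earth and the standard spherical Mercator formulas; the function name, the radius constant and the sample point are only illustrative choices, not something taken from a particular GIS.

```python
import math

# Assumption: a spherical Earth whose radius equals the WGS 84 semi-major axis.
EARTH_RADIUS_M = 6378137.0


def to_spherical_mercator(lon_deg, lat_deg):
    """Project longitude/latitude in degrees to spherical Mercator metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2)
    )
    return x, y


# Hypothetical sample point, roughly where Stockholm is.
lon, lat = 18.07, 59.33
x, y = to_spherical_mercator(lon, lat)
print(f"geographic: ({lon}, {lat}) degrees")
print(f"mercator:   ({x:.0f}, {y:.0f}) metres")
```

The same place gets two completely different pairs of numbers. If one layer stores degrees and another stores projected metres, and nobody converts between them, the layers simply will not line up.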
Since version 2008 you can also put maps into SQL Server, which opens up new hacking possibilities that I am eager to explore. The problem, however, is that most of the map data out there is in a different format.
The industry standard for map data is the shapefile used in ArcGIS, nowadays in competition with KMZ used by Google.
SQL Server supports neither of these.
What's worse, I haven't really found any easy, free way to import data. I installed GeoKettle after first having to learn how to install a .jar file with administrator rights. When I finally got it started, I didn't really understand it. I just want to import some data, and I don't want a GUI that is more complex than choosing a source and a destination. I checked out FME, which is good but not free, but it can only read from SQL Server spatial data. I need to write. I finally ended up with ogr2ogr from GDAL, which means using the good old DOS prompt instead of some fancy GUI. Not a simple solution (I even had to compile it myself), but I guess I'll just have to try it as soon as my huge data set comes in. This link made it seem promising. So hopefully I will soon control the world! :)
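For the record, the kind of import I have in mind would look roughly like the Python sketch below, which just assembles the ogr2ogr command line. It is untested against a real server: the file, server, database and table names are made up, and it assumes a GDAL build that includes the MSSQLSpatial driver and a Windows login that SQL Server trusts.

```python
def build_ogr2ogr_command(shapefile, server, database, table):
    """Build an ogr2ogr argument list that loads a shapefile into SQL Server."""
    # Connection string in the form used by GDAL's MSSQLSpatial driver.
    destination = (
        f"MSSQL:server={server};database={database};trusted_connection=yes"
    )
    return [
        "ogr2ogr",
        "-f", "MSSQLSpatial",           # output driver
        destination,                    # ogr2ogr takes the destination first
        shapefile,                      # then the source
        "-nln", table,                  # name of the table to create
        "-t_srs", "EPSG:4326",          # reproject to WGS 84 longitude/latitude
        "-lco", "GEOM_TYPE=geography",  # store as the round-earth geography type
    ]


# Hypothetical names, for illustration only.
cmd = build_ogr2ogr_command("countries.shp", "localhost", "World", "countries")
print(cmd)

# To actually run the import:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Choosing the geography type rather than geometry is my way of letting the world stay round in the database; whether it copes with every data set is something I would still have to find out.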