
In a previous post I presented the jts-discretizer, a small project that approximates geometries with geo-hashes. Now it's time to see it in action: working in Scala, we will take a dataset of country borders (multi-polygons) and approximate each country's surface with geo-hashes.
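As a quick reminder of what a geo-hash is, here is a minimal sketch of the standard encoding: the longitude and latitude ranges are halved alternately, each halving yields one bit, and every 5 bits become one base-32 character. The object `GeoHashSketch` and its `encode` function are illustrative names only; they are not part of the jts-discretizer, which relies on a geo-hash library instead.

```scala
object GeoHashSketch {
  // Geo-hash base-32 alphabet (digits plus letters, without a, i, l, o).
  private val Base32 = "0123456789bcdefghjkmnpqrstuvwxyz"

  def encode(lat: Double, lon: Double, precision: Int): String = {
    var latLo = -90.0;  var latHi = 90.0
    var lonLo = -180.0; var lonHi = 180.0
    val sb = new StringBuilder
    var current = 0        // bits collected for the current character
    var bitCount = 0
    var useLon = true      // the first bit refines the longitude

    while (sb.length < precision) {
      if (useLon) {
        val mid = (lonLo + lonHi) / 2
        if (lon >= mid) { current = (current << 1) | 1; lonLo = mid }
        else            { current = current << 1;       lonHi = mid }
      } else {
        val mid = (latLo + latHi) / 2
        if (lat >= mid) { current = (current << 1) | 1; latLo = mid }
        else            { current = current << 1;       latHi = mid }
      }
      useLon = !useLon
      bitCount += 1
      if (bitCount == 5) { // 5 bits make one base-32 character
        sb.append(Base32(current))
        current = 0
        bitCount = 0
      }
    }
    sb.toString
  }
}
```

Every point that falls into the same cell gets the same string, which is why a set of geo-hashes can stand in for a surface.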

The original dataset

We will use borders provided by Natural Earth. A simplified version of this dataset is available as borders_wkt.csv, with the following columns:

  • continent
  • country name
  • fips country code (-99 if not present in the Natural Earth dataset)
  • iso country code (-99 if not present in the Natural Earth dataset)
  • border multi-polygon as WKT

Dependencies

"io.github.adrianulbona" % "jts-discretizer" % "0.1.0"
"com.github.tototoshi" %% "scala-csv" % "1.3.4"

Load the dataset

import java.io.File
import com.github.tototoshi.csv.CSVReader

case class Id(continent: String, name: String, fips: String, iso: String) {
  def toList: List[String] = List(continent, name, fips, iso)
}
case class Border(id: Id, wkt: String)

// Reads every row of the CSV and maps it to a Border.
def read(): List[Border] = {
  val reader = CSVReader.open(new File("data/borders_wkt.csv"))
  try {
    reader.all().map {
      case List(continent, name, fips, iso, wkt) =>
        Border(Id(continent, name, fips, iso), wkt)
    }
  }
  finally reader.close()
}
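Note that `map` with a pattern-matching function throws a `MatchError` as soon as one row does not have exactly five fields. If the file may contain such rows, one option is `collect`, which silently skips them. A minimal sketch on already-parsed rows, where `BorderRows` and `toBorders` are hypothetical names and the case classes are repeated only to keep the snippet self-contained:

```scala
object BorderRows {
  case class Id(continent: String, name: String, fips: String, iso: String)
  case class Border(id: Id, wkt: String)

  // collect keeps only the rows matched by the pattern;
  // rows with any other number of fields are dropped instead of throwing.
  def toBorders(rows: List[List[String]]): List[Border] =
    rows.collect {
      case List(continent, name, fips, iso, wkt) =>
        Border(Id(continent, name, fips, iso), wkt)
    }
}
```

Whether skipping is preferable to failing fast depends on how much you trust the input; for a curated dataset, the stricter `map` may well be the better choice.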

Discretize multi-polygons

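// Package names below are assumed for jts-discretizer 0.1.0 and its dependencies.
import java.lang.System.currentTimeMillis
import scala.collection.JavaConverters._
import ch.hsr.geohash.GeoHash
import com.vividsolutions.jts.io.WKTReader
import io.github.adrianulbona.jts.discretizer.DiscretizerFactoryImpl

case class Surface(country: Id, geoHashes: Set[GeoHash]) {
  def toList: List[String] =
    country.toList :+ geoHashes.map(_.toBase32).mkString(",")
}

// Parses the WKT border and approximates it with geo-hashes of the given precision.
def discretize(precision: Int)(border: Border): Surface = {
  val geometry = new WKTReader().read(border.wkt)
  val refTime = currentTimeMillis()
  val geoHashes = new DiscretizerFactoryImpl()
    .discretizer(geometry)
    .apply(geometry, precision).asScala.toSet
  val duration = currentTimeMillis() - refTime
  println(s"Discretized ${border.id.name} in $duration ms.")
  Surface(border.id, geoHashes)
}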
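The precision controls how fine the approximation is. Assuming it counts base-32 characters and that the standard geo-hash layout applies (5 bits per character, alternating between longitude and latitude, longitude first), the cell size for a given precision can be estimated as below. `GeoHashCellSize` and `cellSizeDegrees` are illustrative names, not part of the jts-discretizer.

```scala
object GeoHashCellSize {
  // Returns (width, height) of a geo-hash cell in degrees.
  def cellSizeDegrees(precision: Int): (Double, Double) = {
    val bits = 5 * precision
    val lonBits = (bits + 1) / 2 // longitude gets the extra bit when the total is odd
    val latBits = bits / 2
    val width = 360.0 / (1L << lonBits)
    val height = 180.0 / (1L << latBits)
    (width, height)
  }

  def main(args: Array[String]): Unit =
    for (precision <- 3 to 5) {
      val (width, height) = cellSizeDegrees(precision)
      println(f"precision $precision: $width%.5f x $height%.5f degrees")
    }
}
```

Each extra character adds 5 bits, so a cell splits into 32 smaller ones. The number of geo-hashes per country, and with it the running time and the output size, therefore grows by roughly that factor with every precision step.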

Write surface approximations

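import java.io.File
import com.github.tototoshi.csv.CSVWriter

// Writes one CSV row per country: the id fields followed by the comma-separated geo-hashes.
def write(discretizations: List[Surface]): Unit = {
  val writer = CSVWriter.open(new File("data/countries_discretized.csv"))
  try {
    writer.writeAll(discretizations.map(_.toList))
  }
  finally writer.close()
}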

Putting it all together

val precision = 5
val surfaces = read()
  .par // discretize the countries in parallel
  .map(discretize(precision))
  .toList
write(surfaces)

Results

countries_discretized_3.csv

countries_discretized_4.csv

countries_discretized_5.csv

Complete SBT project at github.com/adrianulbona/borders