In a previous post I presented the jts-discretizer, a small project able to approximate geometries with geo-hashes. Now it’s time to see it in action: we will work in Scala on a dataset of country borders (multi-polygons) and see how to approximate country surfaces with geo-hashes.
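As a reminder of what a geo-hash is, here is a minimal, dependency-free sketch of the standard encoding: the latitude and longitude ranges are halved alternately (longitude first), and every five bits become one base-32 character. The object name `GeoHashSketch` and its `encode` function are illustrative only; this is not the code used by jts-discretizer or by the geo-hash library it builds on.

```scala
object GeoHashSketch {
  // Standard geo-hash base-32 alphabet (it omits a, i, l and o).
  private val Base32 = "0123456789bcdefghjkmnpqrstuvwxyz"

  /** Encodes a point as a geo-hash of `precision` characters. */
  def encode(lat: Double, lon: Double, precision: Int): String = {
    var (minLat, maxLat) = (-90.0, 90.0)
    var (minLon, maxLon) = (-180.0, 180.0)
    val sb = new StringBuilder
    var lonTurn = true // bits alternate, starting with longitude
    var index = 0      // value of the 5-bit group being built
    var bits = 0       // number of bits collected in the current group
    while (sb.length < precision) {
      if (lonTurn) {
        val mid = (minLon + maxLon) / 2
        if (lon >= mid) { index = (index << 1) | 1; minLon = mid }
        else { index = index << 1; maxLon = mid }
      } else {
        val mid = (minLat + maxLat) / 2
        if (lat >= mid) { index = (index << 1) | 1; minLat = mid }
        else { index = index << 1; maxLat = mid }
      }
      lonTurn = !lonTurn
      bits += 1
      if (bits == 5) { // a full group maps to one base-32 character
        sb.append(Base32(index))
        bits = 0
        index = 0
      }
    }
    sb.toString
  }
}
```

For example, `GeoHashSketch.encode(57.64911, 10.40744, 5)` yields `"u4pru"`. Each additional character narrows the cell further, which is why a surface can be approximated by a set of geo-hashes of a fixed precision.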

The original dataset

We will use borders provided by Natural Earth. A simplified version of this dataset is available as borders_wkt.csv, with the following columns:

  • continent
  • country name
  • FIPS country code (-99 if not present in the Natural Earth dataset)
  • ISO country code (-99 if not present in the Natural Earth dataset)
  • border multi-polygon as WKT

Dependencies

  "io.github.adrianulbona" % "jts-discretizer" % "0.1.0"
  "com.github.tototoshi" %% "scala-csv" % "1.3.4"

Load the dataset

case class Id(continent: String, name: String, fips: String, iso: String) {
    def toList: List[String] = List(continent, name, fips, iso)
}
case class Border(id: Id, wkt: String)

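import java.io.File
import com.github.tototoshi.csv.CSVReader

def read(): List[Border] = {
    val reader = CSVReader.open(new File("data/borders_wkt.csv"))
    try {
        // Each row is expected to hold exactly the five columns listed above;
        // any other shape fails with a MatchError.
        reader.all().map({
            case List(continent, name, fips, iso, wkt)
                => Border(Id(continent, name, fips, iso), wkt)
        })
    }
    finally reader.close()
}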
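If the CSV might contain a header line or malformed rows (an assumption; the file used here may well be clean), the row-to-`Border` mapping can be made tolerant by using `collect` instead of `map`. The sketch below is independent of scala-csv and works on already parsed rows; `BorderRows` and `parse` are hypothetical names, and the case classes are repeated (without `toList`) only to keep it self-contained.

```scala
object BorderRows {
  final case class Id(continent: String, name: String, fips: String, iso: String)
  final case class Border(id: Id, wkt: String)

  /** Keeps rows with exactly five fields; returns the borders and the number of skipped rows. */
  def parse(rows: List[List[String]]): (List[Border], Int) = {
    val borders = rows.collect {
      case List(continent, name, fips, iso, wkt) =>
        Border(Id(continent, name, fips, iso), wkt)
    }
    (borders, rows.size - borders.size)
  }
}
```

Returning the number of skipped rows makes it easy to log how much of the input was ignored instead of dropping it silently.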

Discretize multi-polygons

case class Surface(country: Id, geoHashes: Set[GeoHash]) {
    def toList: List[String] =
        country.toList :+ geoHashes.map(_.toBase32).mkString(",")
}

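// WKTReader comes from JTS, GeoHash and DiscretizerFactoryImpl from the
// jts-discretizer dependency tree; their imports are omitted here.
import java.lang.System.currentTimeMillis
import scala.collection.JavaConverters._

def discretize(precision: Int)(border: Border): Surface = {
    // Parse the WKT border into a JTS geometry.
    val geometry = new WKTReader().read(border.wkt)
    val refTime = currentTimeMillis()
    // Cover the geometry with geo-hashes of the requested precision.
    val geoHashes = new DiscretizerFactoryImpl()
        .discretizer(geometry)
        .apply(geometry, precision).asScala.toSet
    val duration = currentTimeMillis() - refTime
    println(s"Discretized ${border.id.name} in $duration ms.")
    Surface(border.id, geoHashes)
}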
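To give an intuition for what discretizing means, here is a simplified sketch that covers an axis-aligned rectangle (not a real multi-polygon) with geo-hash cells by recursively refining only the prefixes whose cell overlaps the region. It is not how jts-discretizer is implemented; `DiscretizeSketch`, `Box`, `cell` and `discretize` are hypothetical names introduced for illustration.

```scala
object DiscretizeSketch {
  private val Base32 = "0123456789bcdefghjkmnpqrstuvwxyz"

  /** Axis-aligned box in degrees. */
  final case class Box(minLon: Double, minLat: Double, maxLon: Double, maxLat: Double) {
    // Strict comparison: boxes that only share an edge do not intersect.
    def intersects(o: Box): Boolean =
      minLon < o.maxLon && o.minLon < maxLon && minLat < o.maxLat && o.minLat < maxLat
  }

  /** Bounding box of a geo-hash cell, decoded bit by bit (longitude bit first). */
  def cell(hash: String): Box = {
    var minLon = -180.0; var maxLon = 180.0
    var minLat = -90.0; var maxLat = 90.0
    var lonTurn = true
    for (c <- hash; shift <- 4 to 0 by -1) {
      val bit = (Base32.indexOf(c) >> shift) & 1
      if (lonTurn) {
        val mid = (minLon + maxLon) / 2
        if (bit == 1) minLon = mid else maxLon = mid
      } else {
        val mid = (minLat + maxLat) / 2
        if (bit == 1) minLat = mid else maxLat = mid
      }
      lonTurn = !lonTurn
    }
    Box(minLon, minLat, maxLon, maxLat)
  }

  /** All geo-hashes of the given precision whose cell overlaps `region`. */
  def discretize(region: Box, precision: Int): Set[String] = {
    def go(prefix: String): Set[String] =
      if (prefix.nonEmpty && !cell(prefix).intersects(region)) Set.empty
      else if (prefix.length == precision) Set(prefix)
      else Base32.iterator.flatMap(c => go(prefix + c)).toSet
    go("")
  }
}
```

For instance, `DiscretizeSketch.discretize(DiscretizeSketch.Box(-1.0, -1.0, 1.0, 1.0), 1)` returns the four cells that meet at latitude 0, longitude 0: `Set("7", "e", "k", "s")`. A real discretizer has to test cells against arbitrary geometries rather than rectangles, which is the part JTS takes care of.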

Write surface approximations

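import com.github.tototoshi.csv.CSVWriter

def write(discretizations: List[Surface]): Unit = {
    val writer = CSVWriter.open(new File("data/countries_discretized.csv"))
    try {
        // One row per country: the id columns followed by the comma-separated geo-hashes.
        writer.writeAll(discretizations.map(_.toList))
    }
    finally writer.close()
}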

Putting it all together

val precision = 5
val surfaces = read()
    .par
    .map(discretize(precision)(_))
    .toList
write(surfaces)

Results

countries_discretized_3.csv

countries_discretized_4.csv

countries_discretized_5.csv
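The file names suggest one result per precision (3, 4 and 5). To get a feeling for how coarse each approximation is, the sketch below computes the size in degrees of a geo-hash cell for a given precision, using the fact that each character carries 5 bits, split between longitude and latitude with longitude receiving the extra bit when the total is odd. `CellSize` and `degrees` are hypothetical names, not part of any library used above.

```scala
object CellSize {
  /** (width, height) in degrees of a geo-hash cell with `precision` base-32 characters. */
  def degrees(precision: Int): (Double, Double) = {
    val bits = 5 * precision
    val lonBits = (bits + 1) / 2 // longitude gets the extra bit when the total is odd
    val latBits = bits / 2
    (360.0 / math.pow(2, lonBits), 180.0 / math.pow(2, latBits))
  }
}
```

With this, precision 3 gives cells of 1.40625° x 1.40625°, precision 4 gives 0.3515625° x 0.17578125°, and precision 5 gives 0.0439453125° x 0.0439453125°. The number of cells needed to cover a country therefore grows by a factor of about 32 with each extra character.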

Complete SBT project at github.com/adrianulbona/borders