Main menu:

Site search

Categories

Tags

intern

Oracle und SQL

Perl

PostGIS: find the regions with the highest density

For my perl Module Geo::Heatmap you need to find a the maximum number of point per tile (or rather per bin). You may achieve this by parsing the the logs, or use the Power of PostGIS:

To group geographically close points you may use the ST_GeoHash function:


with geohash as (
select ST_GeoHash(geom::geometry, 5) st_geohash,
geom
from geodata where
not(St_X(geom) < -180 or St_Y(geom)< -90 or St_X(geom) > 180 or St_Y(geom) > 90)
)
SELECT ST_Extent(geom) as extent FROM geohash
where st_geohash =
(select st_geohash from
(select st_geohash, count(*) c
from geohash
group by st_geohash
order by c desc limit 1) max_geohash)
)

The number 5 used in the function denotes the length of the hash the geom points should have in common. If you group the points and order them by their count you’ll find the most densely populated regions in/of your data.

The function ST_Extent then gives you the lat/long Box containing this Points. Using only the first line of the result may not lead to the absolutly “correct” solution but for the purpose of scaling it is “good enough”

Geo::Heatmap for Google Maps

Yesterday I released a new Version of Geo::Heatmap (0.16) it should be available on the CPAN mirror of your choice
Please see my older post to the same topic

Two major changes:

  • I got rid of Image::Magik (which ain’t fun to install) and replaced it with Imager which installs easily
  • The example provided now uses the Version 3 of the Google Maps API

The example (Geo::Heatmap Demonstration) still looks the same

Coding example

The “calling” HTML

  <head>
     <meta name="viewport" content="initial-scale=1.0, user-scalable=no" />
     <style type="text/css">
       html { height: 100% }
       body { height: 100%; margin: 0; padding: 0 }
       #map-canvas { height: 100% }
     </style>
     <script type="text/javascript"
       src="https://maps.googleapis.com/maps/api/js?key=&sensor=true">
     </script>
     <script type="text/javascript">
       var overlayMaps = [{
         getTileUrl: function(coord, zoom) {
           return "hm.fcgi?tile="+coord.x+"+"+coord.y+"+"+zoom;
         },

         tileSize: new google.maps.Size(256, 256),
         isPng: true,
         opacity: 0.4
       }];

       function initialize() {
         var mapOptions = {
           center: new google.maps.LatLng(48.2130, 16.375),
           zoom: 9
         };
         var map = new google.maps.Map(document.getElementById("map-canvas"),
             mapOptions);

       var overlayMap = new google.maps.ImageMapType(overlayMaps[0]);
       map.overlayMapTypes.setAt(0,overlayMap);

       }
       google.maps.event.addDomListener(window, 'load', initialize);

     </script>
   </head>
   <body>
     <div id="map-canvas"/>
  </body>

The (f)cgi script

#!/usr/bin/env perl

use strict;
use FCGI;
use DBI;
use CHI;
use FindBin qw/$Bin/;
use lib "$Bin/../lib";

use Geo::Heatmap;

my $cache = CHI->new( driver => 'File',
         root_dir => '/tmp/domainmap'
     );

our $dbh = DBI->connect("dbi:Pg:dbname=gisdb", 'gisdb', 'gisdb', {AutoCommit => 0});

my $request = FCGI::Request();

while ($request->Accept() >= 0) {
  my $env = $request->GetEnvironment();
  my $p = $env->{'QUERY_STRING'};

  my ($tile) = ($p =~ /tile=(.+)/);
  $tile =~ s/\+/ /g;

  # package needs a CHI Object for caching
  #               a Function Reference to get LatLOng within a Google Tile
  #               maximum number of points per zoom level

  my $ghm = Geo::Heatmap->new();
  $ghm->palette('palette.store');

  # in this case the zoom scale is static - basically it is the count of the tile with the most values
  $ghm->zoom_scale( {
    1 => 298983,
    2 => 177127,
    3 => 104949,
    4 => 90185,
    5 => 70338,
    6 => 37742,
    7 => 28157,
    8 => 12541,
    9 => 3662,
    10 => 1275,
    11 => 417,
    12 => 130,
    13 => 41,
    14 => 18,
    15 => 10,
    16 => 6,
    17 => 2,
    18 => 0,
  } );

  $ghm->cache($cache);
  $ghm->return_points( \&get_points );
  my $image = $ghm->tile($tile);

  my $length = length($image);

  print "Content-type: image/png\n";
  print "Content-length: $length \n\n";
  binmode STDOUT;
  print $image;

}

sub get_points {
  my $r = shift;

  my $sth = $dbh->prepare( qq(select ST_AsEWKT(geom) from geodata
                         where geom &&
              ST_SetSRID(ST_MakeBox2D(ST_Point($r->{LATN}, $r->{LNGW}),
                                                       ST_Point($r->{LATS}, $r->{LNGE})
                        ),4326))
              );

  $sth->execute();

  my @p;
  while (my @r = $sth->fetchrow) {
    my ($x, $y) = ($r[0] =~/POINT\((.+?) (.+?)\)/);
    push (@p, [$x ,$y]);
  }
  $sth->finish;
  return \@p;
}

Migration Julianischer zu Gregorianischer Kalender

Migration Julianischer zu Gregorianischer Kalender

mysql


select adddate('1582-10-04', interval 1 day);
| adddate('1582-10-04', interval 1 day) |
| 1582-10-05 |
FAIL!

php


$date = new DateTime('1582-10-04');
$interval = $date->add(new DateInterval('P01D'));
echo $interval->format('Y-m-d');
1582-10-05
FAIL!

python


>>> import datetime
>>> e = datetime.date(1582, 10, 4)
>>> delta = datetime.timedelta(days=1)
>>> print e+delta
1582-10-05
FAIL!!!

postgres


postgres=# select to_char(to_date('15821004', 'YYYYMMDD') + 1, 'DD.MM.YYYY');
to_char ------------
05.10.1582 (1 row)
FAIL!

perl


perl -MDateTime -le 'print DateTime->new(year => 1582, month => 10, day => 4)->add(days => 1)'
1582-10-05T00:00:00
FAIL!

oracle


SQL> select to_char( to_date('1582104', 'yyyymmdd')+1, 'dd.mm.yyyy') from dual;
TO_CHAR(TO ----------
15.10.1582
GOOD!

R


> dtj <- strptime(c("5.10.1582"), format = "%d.%m.%Y");
> dtj + 86400
[1] “1582-10-06 CET”

Registry Data Escrow format

a downloadable/versioned Version of the Registry Data Escrow format from http://tools.ietf.org/html/draft-arias-noguchi-registry-data-escrow-04 may be found at github

Generating Heatmap/Density Overlays for Google Maps

Quite often one wants to visualize how your customers/incidents/whatever are distributed geographically.

preview-geo-heatmap

I’ve created a quick hack which does most of the stuff for you. The only obstacle you have to overcame is to encode all your data into long/lat format - Google Geocoder might be of some help. The coordinates have to be stored in database which allows for querying which points are within a defined long/lat box. I would recommand postgres/postgis for this task.

A real life example for Geo::Heatmap - Google Heatmaps is here

The most recent code may be found at github

Or preferably via cpan, so just have to do

cpanm Geo::Heatmap

The code still has a lot of rough corners, but if there is enough demand I’ll keep on working on it. And as everybody who sees it has some idea what he/she would like to have visualized with it, I am quite optimistic that this is going to happen. So don’t hesitate to tell me about your suggestions/bugs/fixes/etc. (admin at trust-box.at)

Have fun, but do not forget:

XML Generation in Perl - how it should have always been

At first big thanks to Mark Overmeer for XML::Compile. I had the pleasure to meet Mark in .nl once - cheers and carry on the great work.

Whenever I had to create XML (either with perl or php) I did it one way or the other with some sort of templating toolkit. In php I use(d) http://www.tinybutstrong.com/ for creating xml . In perl several templating systems come into mind like

  • tt2
  • Template::Declare
  • and many others more
  • Feeling Wrong

    this always felt wrong and awkward for several reasons:

  • You need to build/have the xml before (and creating xml from xsd is not one of my many hobbies)
  • You/the templating system have to take care whether or not a whole tree needs to be displayed and so on
  • The Rescue: XML::Compile

    This nifty Module comes to the rescue. Using XML::Compile make xml+xsd behave like I always wanted to.

    I want to demonstrate this using epp as an example. I want to create a valid epp frame for creating a contact object.

    The epp schemas may be obtained by a simple internet search eg here.

    You may create a valid epp frame by reading the both the contact-1.0.xsd and the epp-1.0.xsd each approx. 400 lines of thrilling xml. Or use some code like:

    1. Convert xsd to perl hash

    use strict;
    use warnings;
    
    use XML::Compile::Schema;
    
    my $schema = XML::Compile::Schema->new([
         'epp-xsd/epp-1.0.xsd',
         'epp-xsd/eppcom-1.0.xsd',
         'epp-xsd/contact-1.0.xsd',
                          ]);
    
    my $s = $schema->template('PERL' => '{urn:ietf:params:xml:ns:contact-1.0}create');
    print $s;
    

    which gives you a more readable idea what your data should look like:

    # is a x0:createType
    { # sequence of id, postalInfo, voice, fax, email, authInfo, disclose
    
      # is a xs:token
      # length <= 16
      # length >= 3
      id => "token",
    
      # is a x0:postalInfoType
      # occurs 1 <= # <= 2 times
      postalInfo =>
      [ { # sequence of name, org, addr
    
          # is a xs:normalizedString
          # length <= 255
          # length >= 1
          name => "example",
    ....
    

    Using this perl hash template you create the first part of your epp-xml

    2. Use perl hash to xml conversion

    use strict;
    use warnings;
    
    use XML::Compile::Schema;
    
    my $schema = XML::Compile::Schema->new([
         'epp-xsd/epp-1.0.xsd',
         'epp-xsd/eppcom-1.0.xsd',
         'epp-xsd/contact-1.0.xsd'
       ]);
    
    my $write  = $schema->compile(WRITER => '{urn:ietf:params:xml:ns:contact-1.0}create');
    my $doc    = XML::LibXML::Document->new('1.0', 'UTF-8');
    my $hash = {
                  id => 'idid',
                  postalInfo => {
                    'name' => 'name',
                    'addr' => {
                        'street' => ['street', 'street2'],
                        'city'   => 'city',
                        'cc'     => 'cc',
                    },
                    type => 'int',
                  },
                  email => 'mymail',
                  "authInfo" => {pw => "PWauthInfo"},
               };
    my $xml    = $write->($doc, $hash);
    $doc->setDocumentElement($xml);
    
    print $doc->toString(1);
    

    this leads to the following xml

    <?xml version="1.0" encoding="UTF-8"?>
    <x0:create xmlns:x0="urn:ietf:params:xml:ns:contact-1.0">
      <x0:id>idid</x0:id>
      <x0:postalInfo type="int">
        <x0:name>name</x0:name>
        <x0:addr>
          <x0:street>street</x0:street>
          <x0:street>street2</x0:street>
          <x0:city>city</x0:city>
          <x0:cc>cc</x0:cc>
        </x0:addr>
      </x0:postalInfo>
      <x0:email>mymail</x0:email>
      <x0:authInfo>
        <x0:pw>PWauthInfo</x0:pw>
      </x0:authInfo>
    </x0:create>
    

    this xml has to be wrapped into an epp frame. As the epp frame uses xml any elements some “manual” work is necessary for creating the complete epp frame. In principle you start with {urn:ietf:params:xml:ns:epp-1.0}epp at step one.

    3. Generate epp-xml-frame and wrap contact-create command into it

    use strict;
    use warnings;
    
    use XML::Compile::Cache;
    use XML::Compile::Schema;
    
    my $cache = XML::Compile::Cache->new([
         'epp-xsd/eppcom-1.0.xsd',
         'epp-xsd/epp-1.0.xsd',
         'epp-xsd/extensions.xsd',
         'epp-xsd/contact-1.0.xsd',
       ]);
    
    my $create_contact_ns = '{urn:ietf:params:xml:ns:contact-1.0}create';
    my $epp_frame_ns = '{urn:ietf:params:xml:ns:epp-1.0}epp';
    my $prefixes = {'urn:ietf:params:xml:ns:contact-1.0' => 'contact'};
    
    $cache->declare(WRITER => [$create_contact_ns, $epp_frame_ns, ],
           (
             prefixes => $prefixes,
             use_default_namespace => 1,
             include_namespaces => 1,
            )
          );
    
    $cache->compileAll;
    
    my $doc = XML::LibXML::Document->new('1.0', 'UTF-8');
    $doc->setStandalone(0);
    
    my $contact_data = {
                    id => 'idid',
                    postalInfo => {
                      'name' => 'name',
                      'addr' => {
                          'street' => ['street', 'street2'],
                          'city'   => 'city',
                          'cc'     => 'cc',
                      },
                      type => 'int',
                    },
                    email => 'mymail',
                    "authInfo" => {pw => "PWauthInfo"},
                 };
    
    my $xml = $cache->writer($create_contact_ns)->($doc, $contact_data);
    
    my $epp_frame_data = {
    	         command => {
                  create => {
                    '{urn:ietf:params:xml:ns:contact-1.0}create' => $xml,
                    },
                  clTRID => "token",
                 },
               };
    
    my $eppxml = $cache->writer($epp_frame_ns)->($doc, $epp_frame_data);
    
    $eppxml->setNamespace( 'http://www.w3.org/2001/XMLSchema-instance', 'xsi', 0 ); ## append additonal ns for the feelinx
    
    print $doc->toString(1) .
          $eppxml->toString(1) ."n";
    

    Voila, we have a valid epp frame without even touching the xml! Again Kudos to Mark Overmeer for making this possible!

    4. The result

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <epp xmlns="urn:ietf:params:xml:ns:epp-1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <command>
        <create>
          <contact:create xmlns:contact="urn:ietf:params:xml:ns:contact-1.0">
            <contact:id>idid</contact:id>
            <contact:postalInfo type="int">
              <contact:name>name</contact:name>
              <contact:addr>
                <contact:street>street</contact:street>
                <contact:street>street2</contact:street>
                <contact:city>city</contact:city>
                <contact:cc>cc</contact:cc>
              </contact:addr>
            </contact:postalInfo>
            <contact:email>mymail</contact:email>
            <contact:authInfo>
              <contact:pw>PWauthInfo</contact:pw>
            </contact:authInfo>
          </contact:create>
        </create>
        <clTRID>token</clTRID>
      </command>
    </epp>
    

    Das XML der Statistik Austria

    Lobenswerterweise stellt die die Statistik-Austria sehr viele Daten zur “freien” Verfügung ins Netz. So auch alle Straßen- und Ortsnamen Österreichs. Wer sich allerdings das XML ausgedacht hat zb:

    <datensatz>
    <element>10101</element>
    <element>Eisenstadt</element>
    <element>00001</element>
    <element>Eisenstadt</element>
    <element>000001</element>
    <element>Josef Stanislaus Albach-Gasse</element>
    <element>7000</element>
    <element>10101</element>
    </datensatz>

    der hat geistig den Umstieg von csv zu xml noch nicht ganz verkraftet ;-)

    Registrar(Partner) bei der switch.ch werden

    Nachdem ich ja beruflich mit einigen Registries zu tun habe - ich arbeite primär für die nic.at - kenne ich einige Zulassungsprozesse für ccTLD Registries.

    Der bei weitem “lustigste” Prozess ist der der switch.ch, der Schweizer Registry. Dort sind nicht nur finanzielle und organisatorische Hürden zu nehmen (und auch die sind nicht zu knapp) sondern auch ein technischer Parcours.

    Dieser Parcours (der heißt wirklich so) besteht aus 23 epp transactionen (Querbeet, dh personen, hostobjekte, domains anlegen, löschen und so weiter) die einem Durchgang durchzuführen sind.

    Das bedeutet, daß jeder Registrar neben des Implementierungsaufwands für den normalen Clienten, sich nochmals zwei Tage (solange habe ich alles in allem gebraucht) Zeit & Geld nehmen darf um den Parcours zu implementieren. Man muß allerdings auch wirklich zugeben, wenn man den Parcour implementiert (und somit verstanden) hat, daß man auch die Prozesse der switch “durchschaut” hat.

    Scraping web pages in JavaScript with Perl

    Sometimes you want to scrape Webpages which contain JavaScript and therefore resist beeing scraped with Web::Scraper or the likes. Imagine some JavaScript code like the following to disguise a email address.

    function mail() {
    var name = "mail";
    var domain = "example.com";
    var mailto = 'mailto:' + name + '@' + domain;
    document.write(mailto);
    }
    mail();

    One could use somethink elaborate like Selenium to execute the code within a browser and then extract the address with “conventional” means. There are cases when this isn’t sufficent.
    Enter JavaScript::SpiderMonkey, which allows you to execute JavaScript Code on the console without a browser. The only problem remaining is that the console doesn’t provide some properties and methods the browser has, so you have to define them yourself. This happens from line 11-14 where we define the “document” and the method “write”. The rest of the code is pretty self explanatory.


    000: use strict;
    001: use warnings;
    002:
    003: use Slurp;
    004: use JavaScript::SpiderMonkey;
    005:
    006: my $js = JavaScript::SpiderMonkey->new();
    007: my $code = slurp('mailto.js');
    008:
    009: $js->init();
    010:
    011: my $obj = $js->object_by_path("document");
    012:
    013: my @write;
    014: $js->function_set("write", sub { push @write, @_ }, $obj);
    015:
    016: my $rc = $js->eval(
    017: $code
    018: );
    019:
    020: printf "document.write:\n%s\n", join "\n", @write;
    021: printf "Error: %s\n", $@;
    022: printf "Return Code: %s\n", $rc;
    023:
    024: $js->destroy();

    The output is:
    document.write:
    mailto:mail@example.com
    Error:
    Return Code: 1

    Oracle 11.2:IGNORE_ROW_ON_DUPKEY_INDEX - Please No!

    Oracle hat mit 11.2 neue optimizer hints “erfunden” und zwar auch IGNORE_ROW_ON_DUPKEY_INDEX.

    Er dient dazu (wie der Name schon sagt) bei einem INSERT  (und nicht bei einem UPDATE!) auf einen Unique Key die Exception zu ignorieren und das statement tut dann einfach nichts. Es würde dann also zb funktionieren (ohne unique key violation):

    insert into testtable (id, text) values (1, 'testtext der erste');
    insert /*+ IGNORE_ROW_ON_DUPKEY_INDEX(testtable(id))*/ into testtable (id, text) values (1, 'testtext der zweite');

    Dieser Hint ist eigentlich kein Hint sondern eine Option (…), und außerdem aus mehreren Gründen keine gute Idee:

    • andere Oracle Hints (zb für den Optimizer) verändern das Verhalten des Statments nicht, und sind somit kompatibel mit anderen Datenbanken
    • es gibt schon ein statment mit dem man den gleichen Effekt erreichen kann und zwar MERGE, welches man mittelfristig sicher auch in anderen Datenbanken wie Postgres erwarten kann

    Und gleich der Vollständigkeit halber das MERGE-Statement:

    merge into testtable
    using (select 1 from dual)
    on (id = :b_id)
    when not matched then
    insert (id, text) values (:b_id, :b_text);

    ich verwende deswegen bind Variablen weil man ansonsten den Wert :b_id zweimal einsetzen müßte. Entgegen anderslautenden Gerüchten muß es keinen “when matched … where 2=1″ (oder Ähnliches) Abschnitt geben. Das “select 1 from dual” dient dazu um genau ein Zeile zum bekommen und somit die Zahl der zu behandelnden Zeilen zu determinieren.