Real Time Data Warehousing Presentation and Video

At the March Boston MySQL User Group meeting, Jacob Nikom of MIT’s Lincoln Laboratory presented “Optimizing Concurrent Storage and Retrieval Operations for Real-Time Surveillance Applications.” In the middle of the talk, Jacob said he sometimes calls what he did in this application “real-time data warehousing”, which was so accurate that I decided to give that title to this blog post.

The slides can be downloaded in PDF format (1.3 Mb) at http://www.technocation.org/files/doc/Concurrent_database_performance_02.pdf. The 54-minute video can be downloaded (644 Mb) at http://technocation.org/node/693/download or streamed directly in your browser at http://technocation.org/node/693/play.

The talk discussed how to perform real-time retrieval operations during concurrent high-volume insertion, including:


  • how to keep up with an incoming data stream of 1.5 Mb/second per server

  • server hardware comparison between a multi-core AMD Opteron and a multi-core Intel Xeon

  • MySQL/Postgres comparison

  • schema design

  • design of the storage/retrieval benchmark

  • tuning MySQL

Jacob showed how insertion time depends on the number of indexes applied to a table. He also demonstrated the excellent responsiveness of the MySQL server in both simulated and actual surveillance.
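If you want to get a feel for the index effect yourself, here is a minimal sketch of that kind of measurement. It is not Jacob's benchmark: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table name, column names, and row count are all made up for illustration. Absolute numbers will vary by machine, so only compare the timings against each other.

```python
import sqlite3
import time


def time_inserts(num_indexes: int, num_rows: int = 20000) -> float:
    """Time a bulk insert into a table that has `num_indexes` indexes (0 to 4)."""
    # In-memory SQLite database: an assumption for portability, not MySQL.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (a INTEGER, b INTEGER, c INTEGER, d INTEGER)"
    )
    # Add one single-column index per requested index.
    for col in "abcd"[:num_indexes]:
        conn.execute(f"CREATE INDEX idx_{col} ON events ({col})")

    # Hypothetical data; the values only need to be spread out a little.
    rows = [(i, i * 7 % 1000, i * 13 % 1000, i * 17 % 1000) for i in range(num_rows)]

    start = time.perf_counter()
    with conn:  # commits the whole batch as one transaction
        conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", rows)
    elapsed = time.perf_counter() - start
    conn.close()
    return elapsed


timings = {n: time_inserts(n) for n in range(5)}
for n, secs in timings.items():
    print(f"{n} indexes: {secs:.4f} s")
```

Every extra index has to be updated on each insert, so the timings generally grow as indexes are added.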

At about 7 minutes into the presentation, Jacob begins to discuss “marshalling”, which is converting XML to data and back. After the 20-minute mark, an audience member asks what marshalling is, so I want to make sure that folks have the definition ahead of time.
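To make that definition concrete, here is a small sketch of marshalling in both directions using Python's standard xml.etree.ElementTree module. The `Detection` record, its fields, and the XML layout are hypothetical, invented for this example; they are not taken from Jacob's application.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Detection:
    # Hypothetical record; the fields are invented for illustration.
    sensor_id: int
    timestamp: str
    label: str


def xml_to_record(xml_text: str) -> Detection:
    """Convert an XML document into an in-memory record."""
    elem = ET.fromstring(xml_text)
    return Detection(
        sensor_id=int(elem.get("sensor_id")),
        timestamp=elem.findtext("timestamp"),
        label=elem.findtext("label"),
    )


def record_to_xml(record: Detection) -> str:
    """Convert an in-memory record back into XML text."""
    elem = ET.Element("detection", sensor_id=str(record.sensor_id))
    ET.SubElement(elem, "timestamp").text = record.timestamp
    ET.SubElement(elem, "label").text = record.label
    return ET.tostring(elem, encoding="unicode")


xml_in = (
    '<detection sensor_id="7">'
    "<timestamp>2008-03-10T12:00:00</timestamp>"
    "<label>vehicle</label>"
    "</detection>"
)
record = xml_to_record(xml_in)   # XML -> data
xml_out = record_to_xml(record)  # data -> XML
```

The record is what would be stored in or read from the database, while the XML is what travels between components. Every conversion costs CPU time, which is why it comes up in a talk about keeping up with a high-volume data stream.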
