Berlin Web Week

Tag archive for 'Apache'

Apache Hadoop Get Together – 29.9.2009

As always, there will be slots of 20 minutes each for talks on your Hadoop topic. After each talk there will be plenty of time for discussion. You can order drinks directly at the bar in the newthinking store. If you like, you can order pizza. There are quite a few good restaurants nearby, so we can go there after the official part.

Talks scheduled so far:

  • Thorsten Schuett, Solving Puzzles with MapReduce: MapReduce is most often used for data mining and filtering large datasets. In this talk we will show that it is also useful for a completely different problem domain: solving puzzles. Based on MapReduce, we can implement massively parallel breadth-first and heuristic search. MapReduce will take care of the hard problems, like parallelization, disk and error handling, while we can concentrate on the puzzle. Throughout the talk we will use the sliding puzzle (http://en.wikipedia.org/wiki/Sliding_puzzle) as our example.
  • Thilo Götz, Text analytics on jaql: Jaql (JSON query language) is a query language for JavaScript Object Notation that runs on top of Apache Hadoop. It was primarily designed for large-scale analysis of semi-structured data. I will give an introduction to jaql and describe our experiences using it for text analytics tasks. Jaql is open source and available from http://code.google.com/p/jaql.
  • Uwe Schindler, Lucene 2.9 Developments: Numeric Search, Per-Segment- and Near-Real-Time Search, new TokenStream API: Uwe Schindler presents some new additions to Lucene 2.9. In the first half he will talk about fast numerical and date range queries (NumericRangeQuery, formerly TrieRangeQuery) and their usage in geospatial search applications like the Publishing Network for Geoscientific & Environmental Data (PANGAEA). In the second half of his talk, Uwe will highlight various improvements to the internal search implementation for near-real-time search. Finally, he will present the new TokenStream API, based on AttributeSource/Attributes, which makes indexing more pluggable. Future developments in the flexible indexing area will make use of it. As a possible use case, Uwe will show a Tokenizer that uses custom attributes to index XML files into various document fields based on XML element names.
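
To give a flavour of the first talk's idea, here is a minimal single-process sketch of breadth-first search on the 3x3 sliding puzzle, written as alternating map and reduce steps. It is not taken from the talk and does not use Hadoop: the names `neighbors`, `map_phase`, `reduce_phase` and `solve` are hypothetical, and the whole search runs in memory. On a real cluster the map and reduce steps would be distributed jobs and the set of visited states would live on disk.

```python
from collections import defaultdict

# A 3x3 sliding puzzle state is a tuple of 9 ints; 0 marks the blank.
SIZE = 3
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)


def neighbors(state):
    """Yield every state reachable by sliding one tile into the blank."""
    blank = state.index(0)
    row, col = divmod(blank, SIZE)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if 0 <= r < SIZE and 0 <= c < SIZE:
            target = r * SIZE + c
            nxt = list(state)
            nxt[blank], nxt[target] = nxt[target], nxt[blank]
            yield tuple(nxt)


def map_phase(frontier, depth):
    """Map step: emit (successor, depth + 1) for each frontier state."""
    for state in frontier:
        for nxt in neighbors(state):
            yield nxt, depth + 1


def reduce_phase(pairs, seen):
    """Reduce step: group by state and keep only states not seen before."""
    grouped = defaultdict(list)
    for state, depth in pairs:
        grouped[state].append(depth)
    return {s: min(d) for s, d in grouped.items() if s not in seen}


def solve(start, goal=GOAL):
    """Return the minimum number of moves from start to goal, or None."""
    seen = {start: 0}
    frontier = [start]
    depth = 0
    while frontier:
        if goal in seen:
            return seen[goal]
        new_states = reduce_phase(map_phase(frontier, depth), seen)
        seen.update(new_states)
        frontier = list(new_states)  # next BFS level
        depth += 1
    return seen.get(goal)
```

Each loop iteration corresponds to one MapReduce round: the map step expands the current frontier, and the reduce step deduplicates successors by using the puzzle state as the key. A call such as `solve((1, 2, 3, 4, 5, 6, 7, 0, 8))` searches level by level until the goal state shows up.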

We would also like to invite you, the visitor, to tell your Hadoop story. If you like, you can bring slides; a projector will be available.

A big thanks goes to the newthinking store for providing a room in the center of Berlin for us. Another big thanks goes to Cloudera for sponsoring videos of the talks. Links to the videos will be posted here as well as on the Cloudera blog. Yet another big thanks goes to O’Reilly for providing three “Hadoop: The Definitive Guide” books that will be raffled at the event.

Apache Hadoop Get Together
Website
When
Tue, 29.9.2009 17:00 - 21:00
Where

Tucholskystr. 48
10117 Berlin (Mitte)
Germany

Apache Hadoop Get Together Berlin – 25.6.2009

The newthinking store Berlin is hosting the Hadoop Get Together user group meeting. It features talks on Hadoop, Lucene, Solr, UIMA, katta, Mahout and various other projects that deal with making large amounts of data accessible and processable. The event brings together leaders from the developer and user communities. The speakers present projects that build on top of Hadoop as well as case studies of applications being built and deployed on Hadoop. After the talks there is plenty of time for discussion, some beer and food. There is also a related Xing Group on the topic of building scalable information retrieval systems. Feel free to join and meet other developers who work on building scalable solutions.

Agenda

Talks scheduled so far:
Torsten Curdt: Data Legacy – the challenges of an evolving data warehouse

MapReduce is great for processing large data sets. A distributed file system can be used to store huge amounts of data. But what if your data format needs to adapt to new requirements? This talk will cover a simple introduction to Thrift and Protocol Buffers and sprinkle in some rants and approaches to managing your big data sets.

Christoph M. Friedrich (Fraunhofer Institute for Algorithms and Scientific Computing): “SCAIView – Lucene for Life Science Knowledge Discovery”.

Apache Hadoop Get Together Berlin
Start
Thu, 25.6.2009 17:00
End
19:30
Where

Tucholskystr. 48
10117 Berlin (Mitte)
Germany

4. Hadoop Get Together Berlin – 5.3.2009

We are hosting the Hadoop Get Together user group meeting. It features talks on Hadoop, Lucene, Solr, UIMA, katta, Mahout and various other projects that deal with making large amounts of data accessible and processable. The event brings together leaders from the developer and user communities. The speakers present projects that build on top of Hadoop as well as case studies of applications being built and deployed on Hadoop. After the talks there is plenty of time for discussion, some beer and food. There is also a related Xing Group on the topic of building scalable information retrieval systems. Feel free to join and meet other developers who work on building scalable solutions.

Lars George will talk about deploying HBase in a production environment.

If you would like to give a presentation yourself, additional slots of 20 minutes each are available. A projector is provided; just bring your slides. To include your topic on this web site as well as in the upcoming.org entry, please send your proposal to Isabel.

After the talks there will be time for an open discussion. We will go to a nearby restaurant after the event, so there will be plenty of time for talking, discussing and new ideas.

4. Hadoop Get Together Berlin
Website
Start
Thu, 5.3.2009 16:00
End
Thu, 5.3.2009 20:00
Where

Tucholskystr. 48
10117 Berlin (Mitte)
Germany