Archive for April 2012
This is a handy getting-started guide for computer vision using R, e.g. on images from surveillance cameras. As always with R-bloggers, it contains the needed source code.
…just answered: Where are the Semantic web incubators? Any thoughts on building an economic ecosystem for Semantic web to keep momentum enough to attract si
My basic message is: as long as your startup wants to use semantic tools/infrastructures (vs. providing tools and infrastructures), you likely do not need a specific incubator, as the semantic web only influences the tech part of your startup.
Semantic Web: Where are the Semantic web incubators? Any thoughts on building an economic ecosystem for Semantic web to keep momentum enough to attract sizable investment? 1 answer on Quora
How to work with Google n-gram data sets in R using MySQL, via R-Bloggers: what I really like about this blog is its focus on interesting things along with code examples to try out on your own.
N-gram datasets can also be created from your own texts using NLTK functions (see http://nltk.googlecode.com/svn/trunk/doc/howto/collocations.html ): in analytical use cases n-grams give you a better basis for having a machine ‘understand’ the meaning of a text (compared to looking at the words individually).
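To illustrate why n-grams carry more meaning than single words, here is a minimal sketch in plain Python. It does not use NLTK; the `ngrams` helper and the sample sentence are assumptions made up for this example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical sample text; any tokenized text of your own works the same way.
text = "the semantic web links data and the semantic web links people"
tokens = text.lower().split()

bigram_counts = Counter(ngrams(tokens, 2))

# Bigrams keep word order, so "semantic web" is counted as one unit
# instead of two unrelated words.
for bigram, count in bigram_counts.most_common(3):
    print(" ".join(bigram), count)
```

Splitting on whitespace is the simplest possible tokenizer; for real texts a proper tokenizer (such as the ones NLTK provides) is likely the better choice.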
Update from 2012-04-10:
Stefan Keller ( http://twitter.com/sfkeller ) pointed me to a blog entry about how to use n-grams in a PostgreSQL-based setting to optimize search functionality.
The Wikidata project ( http://meta.wikimedia.org/wiki/W… ) follows a path similar to that of the DBPedia project ( dbpedia.org ) in connecting and collecting the facts described in Wikipedia. The extent to which it will follow the semantic web and ontology standards is still open (the current status can be seen at http://meta.wikimedia.org/wiki/W… ).
(And yes, pictures may help people to identify things.)
To my understanding it is much more important to be able to work on your own ontology subset and to link it to more general, documented knowledge (e.g. linking to DBPedia concepts from your own terms using the SKOS vocabulary at http://www.w3.org/TR/skos-primer/ ).
(My experience from this kind of standardization project is that you might be able to manage the technical side of it, but the organizational and managerial aspects get very complicated once you target a single taxonomy.)
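As a rough sketch of what linking your own terms to DBPedia concepts could look like, the snippet below writes SKOS mapping statements as N-Triples lines in plain Python. The `http://example.org/vocab/` namespace, the term names and the chosen DBPedia resources are assumptions for illustration only; `skos:closeMatch` and `skos:exactMatch` are mapping properties from the SKOS vocabulary.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
DBPEDIA = "http://dbpedia.org/resource/"
# Hypothetical namespace for your own vocabulary.
OWN = "http://example.org/vocab/"

# Hypothetical mapping: own term -> (SKOS mapping property, DBPedia resource).
mappings = {
    "CityTrip": ("closeMatch", "Tourism"),
    "Concert": ("exactMatch", "Concert"),
}


def to_ntriples(mappings):
    """Serialize the mappings as N-Triples lines, one triple per mapping."""
    lines = []
    for term, (relation, resource) in sorted(mappings.items()):
        lines.append(f"<{OWN}{term}> <{SKOS}{relation}> <{DBPEDIA}{resource}> .")
    return lines


for line in to_ntriples(mappings):
    print(line)
```

The point of this design is that your own subset stays under your control: only the mapping triples refer to the shared, more general vocabulary.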
Last year, we covered an ambitious collaborative R&D project called "Startup Genome," created by three young entrepreneurs, Bjoern Herrmann, Max Marmer, and Ertan Dogrultan. The goal of the ongoing project was (and is) to take a comprehensive, data-driven dive into what makes tech startups successful -- and not so successful.
Out of its research came, among other things, Startup Compass…
I just wanted to post some remarks on the very interesting blog post by David Smith, as I was able to take a deeper look at the HANA appliance:
- I agree with the statement that tool providers focus today on ‘high-performance analytics’. But the most important steps in the SAP/ERP world are still to be done: too much analytics domain information is today deeply buried in application code, and BI tools from the past were seen merely as pure inspection tools for this information.
- SAP is about to place more application logic on the database layer, which in the long run enables more of David’s “more than just basic analytics”: the usage of (optimizable) prediction models could then become possible.
- (I especially remember a very interesting use case for “more than just basic analytics” from an SAP discussion: appliances like HANA with specific application functionality enable a production company/facility to evaluate the ‘best’ scenario for fulfilling orders, taking into account the bills of material and facts like the availability of parts in case of limitations.)
SAP in fact announced formal R integration:
- The so-called “HANA pocketbook” ( at https://www.experiencesaphana.com/servlet/JiveServlet/previewBody/1436-102-1-1946/SAP%20HANA%20Pocketbook-DRAFT.pdf ) describes the high-level picture of R integration (starting on p. 59).
- Alvaro Tejada ( @blag ) posted a number of blogs on R integration with HANA: http://scn.sap.com/people/alvaro.tejadagalindo3 . I consider him to be the R mastermind inside SAP.
The complete R integration was not present in the previews of HANA I have seen. The key to the success of R in the SAP world is the extent to which constraints on R are in place, e.g. whether all the nice machine learning/Hadoop enablers for R can be used (only small-scale R language support would not be sufficient for these use cases).
With this blog entry I’ll start a series of business ideas that come up around me, which I cannot pursue at the moment, but which are maybe interesting for others.
Tagline: Offer people planning a trip spare-time activities that fit their interests and their personal time planning.
Technology: Mashup of APIs from travel planning tools (like tripit.com or dopplr.com ) and crawled/stored information about events, tourist activities etc., matched on the basis of user profiling, e.g. from Facebook Likes.
Business models: mainly affiliate model (bringing guests to organizers of events/tour organizers)
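The matching at the core of this idea can be sketched in a few lines of Python. Everything here is an assumption for illustration: the `Event` record, the `suggest` function and the sample data are hypothetical, and no real TripIt, Dopplr or Facebook API is called. Trip dates and interest tags are simply passed in as plain values.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Event:
    title: str
    city: str
    day: date
    tags: frozenset


def suggest(events, city, arrival, departure, interests):
    """Return events in the trip's city and date range, best interest match first."""
    candidates = [
        e for e in events
        if e.city == city and arrival <= e.day <= departure and e.tags & interests
    ]
    # Rank by number of shared interest tags; ties keep their input order.
    return sorted(candidates, key=lambda e: len(e.tags & interests), reverse=True)


# Hypothetical crawled events and a hypothetical user profile.
events = [
    Event("Jazz night", "Berlin", date(2012, 4, 21), frozenset({"music", "jazz"})),
    Event("Harbour tour", "Hamburg", date(2012, 4, 21), frozenset({"outdoor"})),
    Event("Open-air rock concert", "Berlin", date(2012, 4, 22),
          frozenset({"music", "outdoor", "rock"})),
    Event("Museum late opening", "Berlin", date(2012, 5, 2), frozenset({"art"})),
]
interests = frozenset({"music", "outdoor"})

picks = suggest(events, "Berlin", date(2012, 4, 20), date(2012, 4, 23), interests)
for event in picks:
    print(event.day, event.title)
```

Counting shared tags is only the simplest possible ranking; a real service would probably weight interests and take the traveller's free time slots into account.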
Martin Hepp ( @mfhepp ), the author of the GoodRelations vocabulary for eBusiness, just posted a cookbook entry to show how business entities offering travel activities (outdoor, concerts etc.) can publish this information in a machine-readable way.
(I think) for this reason he defined the Ticket Ontology to describe events, activities and their business impact.
But for the time being (as long as not many travel organizers make their activities machine-readable) a crucial technical part is collecting travel activities and making and keeping connections with the business entities offering them.
Even this idea can make use of BigData analysis techniques: you can initially optimize and later predict which kind of activity is attractive to which group of users (a use case of customer segmentation).