Monday, September 30, 2013

Using Video within a Big Data Query for IVS

What is the real value of digital video in a Big Data query? Should video even be considered as another source of data?

Most Big Data applications are out for one simple thing: getting decision-driving information from multiple, disparate, structured and unstructured data sources. If all we had to do was query our RDBMSs to get to good decisions, then Big Data as a market would not exist. But that is not the case, and things like weblogs and Twitter feeds, both examples of large sets of unstructured data, are extremely useful to data scientists looking for decision-driving data.

Well, what about video? Can digital video be useful for a query? Early indications are that it is, and the more you think about it, the more you find good, solid use cases in the industry. IVS (intelligent video surveillance) is clearly an early adopter of video. The more that intelligent computer vision algorithms can be combined with queries of other databases to form a real-time, interactive query process, the more intelligible and actionable information we can glean from the query. Yes, there are computer vision algorithms that can search through a digital video file and detect anomalies. But despite their current speed, can we combine that search with a recognition database, such as a facial recognition database, and get back intelligible information? What about height, speed, color, make and model, or even an analysis of behavior?

What if an IVS system could automate a query that determines that someone has entered a secure perimeter, that he is recognized as a known person on the terrorist watch list, and that he is spending time cutting through a fence? What security level should this anomaly trip to get action at the highest level? Anyone who claims that any anomaly should trigger the highest level does not understand the IVS market. After three or four false alarms, most companies turn off their detection and revert to good, old-fashioned human visual detection.
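As a rough illustration of why combining signals matters, here is a minimal Python sketch of graded alerting. Everything in it is hypothetical: the `Detection` fields, the behavior labels and the scoring rule are assumptions made for the example, not part of any real IVS product. The point is only that a lone anomaly gets a low level, while several corroborating signals escalate it.

```python
from dataclasses import dataclass

# Hypothetical behavior labels that a vision algorithm might emit.
SUSPICIOUS_BEHAVIORS = {"cutting_fence", "climbing_fence"}


@dataclass
class Detection:
    """One combined observation; all fields are illustrative assumptions."""
    in_perimeter: bool     # did the person enter the protected area?
    watchlist_match: bool  # did a recognition database return a hit?
    behavior: str          # label from a behavior-analysis step


def alert_level(d: Detection) -> int:
    """Return 0 (ignore) to 3 (highest) based on corroborating signals."""
    if not d.in_perimeter:
        # Design choice for this sketch: nothing outside the perimeter alerts.
        return 0
    level = 1  # a perimeter entry alone is only a low-level event
    if d.watchlist_match:
        level += 1
    if d.behavior in SUSPICIOUS_BEHAVIORS:
        level += 1
    return level
```

With this rule, someone simply walking inside the perimeter scores 1, while the fence-cutting watch-list match described above scores 3. A real system would need tuned thresholds, but grading alerts this way is one approach to the false-alarm problem.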

Big Data has this promise, but video is not an integral part of the query capabilities yet. There is nothing on the market today that allows companies to bring a digital video feed in and query it using SQL like any other data. But that could soon change (stay tuned).

Monday, September 23, 2013

Can we automatically index UGC?

Here is one of the largest dilemmas under the current Big Data sun: will we ever be able to automatically index user-generated video content (UGC)? It seems like an impossibility, but it would have great benefits if possible.

Why would we want to create automatic indices and metadata for the uploaded content? Well, for one, it would make the mass of user-created content that is uploaded to sites like YouTube searchable. Making this mass of video content searchable means we can find the content we are interested in, saving hours of sifting through videos that today can only be found thanks to the good graces of someone who titled the video correctly or added descriptive metadata to their upload.

Video is a content-rich source of data, since we humans perceive much more from video than could ever be tagged. But an auto-tagging system could learn over time, using facial and behavioral recognition patterns to improve its accuracy. This would be an immediate and significant improvement over the voluntary methods we currently use.

The basis for starting automated indexing would be to bring the digital video stream to a standard format where it could be combined with other data sources and included in a standard SQL search. That is now possible, as you can see by my previous posts.
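To make the idea concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It assumes that some recognition step has already turned the video into rows of tags. The table names (`uploads`, `video_tags`), the columns and the sample rows are all invented for illustration and do not come from any real product. Once tags are ordinary rows, finding badly titled videos is a plain SQL join.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with hypothetical auto-generated tags."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE uploads (video_id TEXT PRIMARY KEY, title TEXT);
        CREATE TABLE video_tags (video_id TEXT, second INTEGER, tag TEXT);
    """)
    # Titles as users typed them: mostly useless for search.
    conn.executemany("INSERT INTO uploads VALUES (?, ?)", [
        ("v1", "IMG_0042"),
        ("v2", "untitled"),
        ("v3", "holiday"),
    ])
    # Tags that an auto-indexer might have produced (invented sample data).
    conn.executemany("INSERT INTO video_tags VALUES (?, ?, ?)", [
        ("v1", 3, "dog"),
        ("v1", 10, "beach"),
        ("v2", 5, "dog"),
        ("v3", 1, "cat"),
    ])
    return conn


def find_videos(conn: sqlite3.Connection, tag: str) -> list:
    """Return (video_id, title) for every upload containing the given tag."""
    return conn.execute(
        """
        SELECT DISTINCT u.video_id, u.title
        FROM uploads AS u
        JOIN video_tags AS t ON t.video_id = u.video_id
        WHERE t.tag = ?
        ORDER BY u.video_id
        """,
        (tag,),
    ).fetchall()
```

Calling `find_videos(build_demo_db(), "dog")` finds both `IMG_0042` and `untitled`, neither of which could be found by title alone.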

Wednesday, September 18, 2013

Why is transfer of lots-of-data an issue for Big Data?


This may just be a rhetorical question, but there are a few companies developing technologies that allow a massive amount of data to be transported to a DW environment where - using the latest Big Data tools - this data can be analyzed and further used within larger queries. 


One such company is Attunity (www.attunity.com), and their latest product blog post states clearly that their solution allows data to boldly go where and when you need it.

What I don't understand is this: wasn't the original purpose of MapReduce that data did not have to reside in a single DW before I could query it? As I understand it, this was Google's motivation for building this technology, since they clearly understood that all the data of the web would not be located in one spot.

According to a paper from Teradata, there is a growing issue with the amount of data produced by oil wells, and getting that data transferred across the country, or between countries, to a data center is expensive. My simplistic design question is: why not put together a couple of servers at the drilling site, build a Hadoop cluster onsite, and run queries remotely against the data where it is? Isn't it harder to store, transport and restore the massive amount of data?
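Here is a small Python sketch of what I mean, in the spirit of MapReduce but without any actual Hadoop code. Each site reduces its raw readings to a tiny partial result, and only those partials travel to the data center. The function names and the sample readings are invented for the example.

```python
from functools import reduce


def summarize_site(readings):
    """Runs at the drilling site: collapse raw readings to (count, sum)."""
    return (len(readings), sum(readings))


def combine(a, b):
    """Merge two partial results; order does not matter."""
    return (a[0] + b[0], a[1] + b[1])


def global_mean(partials):
    """Runs centrally: compute the overall mean from the small partials."""
    count, total = reduce(combine, partials, (0, 0.0))
    return total / count if count else None


# Hypothetical raw readings that never leave their sites.
site_a = [10.0, 20.0]
site_b = [30.0]
partials = [summarize_site(site_a), summarize_site(site_b)]
overall = global_mean(partials)
```

However many readings each site holds, only two numbers per site cross the network. This works directly for aggregates such as counts, sums and means; queries that need the raw records side by side are harder, which may be part of the answer to my question.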

Just a question. Please comment.

Thursday, September 12, 2013

Breakthrough: Hadoop-based Video Analytics

Pivotal, a company formed from EMC and VMware, has created a breakthrough for video analysis. Dr. Victor Fang from their Big Data Science Lab has a demo where he chews through and analyzes 5 GB of MPEG-2 video using Pivotal HD and HAWQ in near real-time. His output includes exception images, place of movement, speed and many other parameters of the detection.

Why is this big? It will revolutionize the security industry's Intelligent Video System market because it will learn to recognize exceptions and feed these to security personnel. The system will now intelligently provide information to the monitoring team, rather than hoping to detect events with parameter movement algorithms, which have yet to prove themselves.

Watch this video:
http://www.gopivotal.com/resources#http://bitcast-a.v1.sjc1.bitgravity.com/greenplum/pivotalvideo/Unstructured_Data_Video_Analytics_on_Hadoop_Session_1.m4v

Monday, September 9, 2013

What are good Big Data opportunities that startups should pursue?

The question has come up in a great Quora forum, which I highly recommend. It should get a number of members in this group excited about either joining or founding a startup. Most companies I speak with do not have the capability to analyze data the way they could with Big Data tools, because the time between investment and payoff is far too long to warrant this type of investment.

So, a huge startup opportunity is in providing the information these companies would like to have - and based on some of these ideas, large corporations are willing to spend a pretty penny on these.

The four areas in which startups could make money using Big Data that I find most promising are:

1.) Recruiting: this industry is still suffering from poor conclusions from unstructured data
2.) Online Video Processing: much could be machine recognized to enable indexing
3.) Knowledge Management: still a fleeting target for many companies
4.) Store Location Information: the right mix of parameters could be gold for some

Any others? Comments?

Here is the link to the Quora forum: http://www.quora.com/What-Big-Data-opportunities-will-be-most-interesting-and-profitable-for-startups

Thursday, September 5, 2013

"We don't really buy hardware anymore,"

This quote comes from a Louisiana-based healthcare staffing company. Five years ago, they would buy about 5,000 servers a year; this year they bought only one, from HP (see details in the article below). This is a significant change that most IT workers out there had better pay close attention to.

Who does this trend not affect? It does not affect the server makers as a whole, since computing power is still going to be needed in the world. But it affects the data center workers who have just learned how to get everything working with their server farm. That server farm will be a museum piece as more and more companies move their workloads out to the cloud.

It is time for the IT department to plan its transition from the blissful life of being the most powerful organization in the company to being an enabler of information flow with 50% of its former budget.

CIOs who do not have a 5-year plan to decrease their budget by 50% and increase IT functionality by 300% need to be replaced with those who think outside the box and can efficiently meld the trends of this transition with the business needs of their company. This, along with Big Data, is going to be the greatest transition we have seen in IT since the implementation of RDBMSs. As with so many trends, many will be caught in the middle, unaware that things could change so quickly.

If there is anything we should learn from the trends of the past, it is that everything will change much more quickly than we thought it would. You can bet your career on it!