Wednesday, December 18, 2013

The Focal Point of Data Integration: Did Salesforce Get It Right?

We've all been brought up believing that data integration is a back office function, something completed by large ERP and CRM applications deep in the bowels of the IT department. All ERP vendors promote their own add-ons to facilitate the data integration process.

The problem with data integration is not the transfer of data itself, which goes through a migration process and follows a rigorous MDM (master data management) exercise to determine a unified, enterprise-wide data schema. No, the problem lies in determining the final functionality and the higher-level business goals that must be reached once the integration is complete.

Traditional integration projects worry about making sure the back office systems are integrated, and they are based on sometimes justifiable needs to share data between systems. But the focal point of determining what data needs to be integrated from which back office systems has to be at the customer level, in the front office, to ensure that the business process flows correctly.

This is where Salesforce.com has it right. A company should determine what its salespeople, customers, or partners need on a smartphone or tablet to make the business process flow faster and more efficiently. Based on that need, an integration project can be scoped and completed. The focal point of all data integration needs to be in the front office, as SFDC proposes, not in the back office where we usually find it.


Thursday, December 12, 2013

Last Night: I Was Invited to an Advance Screening of The Hobbit in IMAX 3D!

This was an amazing play on 3D effects that only come to life when viewed in full IMAX technology. I have never seen anything like it, and it gets a full two thumbs up! It was simply amazing, as much a roller coaster ride as I have ever experienced, even in real life. The 3D makes the experience more than you can physically fathom; I walked out drained of energy.

Go and see this, but don't waste your time with 2D or anything less than IMAX!

Thursday, November 28, 2013

2013 Dreamforce: Lots to learn from results

I was unable to attend Dreamforce, the major event put on by Salesforce.com, but plenty of commentary emerged that boils down the lessons for those of us who weren't there.

On this blog, I have written much about the cloud transition and how the datacenter of the future will be in the cloud. Many companies offer real, ready-to-go cloud solutions; the leader in this industry is AWS. But for the mainstream customer with a data center, the transition to the cloud is a major disruption, and until things break, they are unwilling to move.

The other route is the traditional SaaS model: a major business process change that involves adopting a new software package in the cloud forces the same transition. Here there is often no choice. If the functionality you are searching for is in the cloud, then you will move to the cloud.

In the case of Salesforce.com, many companies are finding the best of both worlds. Yes, following an improved sales strategy with a CRM model will lead you to their core CRM package and therefore to the cloud, since SFDC (Salesforce.com) does not offer any other model. But what they have created is a cloud-based platform that can take the core of your customer data and expand it to include all other data your company needs or will need. SFDC can run your company's cloud transition from soup to nuts. Yes, AWS offers the cheapest computing and storage option, and many software packages are becoming available in a SaaS model. But nowhere else will you find a complete data landscape package that allows a transition on your own schedule, piecemeal or complete, as you can with SFDC.

Here are some choice links that I learned a lot from:

Many new ideas
Idea is the customer
Takeaways from Marissa Mayer

Monday, November 11, 2013

PaaS the Fastest Growing Cloud Segment


This is something I have written about before, but the market has finally caught on: PaaS will be the dominant growth segment throughout 2014. For more information on exactly what this market is and what PaaS means, read this.

Why PaaS? Because the easiest business case for cloud adoption is software development and operations, now called DevOps. The business case is simple; ask anyone who has ever run a data center. Whenever a new version or release of a piece of code was ready for functional, regression, or production testing, a new environment had to be created for that test. The datacenter team had to find servers, rack them, install them, and build a complete environment just to run the test. Once the test was done, the hardware had to stay untouched so that the next test round (yes, they did find bugs) could be set up more quickly.

The problem here is not only idle hardware and wasted setup effort; the worst part is the time it takes to get a new release out the door.

Cloud, and especially PaaS, promises to change that forever. Users (the software development team) can set up an environment on their own under a cloud account, run the test, and pay for the hardware only while they need it. As soon as they don't, the resources stop costing money. DevOps is the easiest-to-understand use of cloud computing.
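To make the spin-up/tear-down pattern concrete, here is a minimal sketch using the AWS boto3 SDK. The AMI ID and instance details are hypothetical placeholders; the point is simply that the environment exists only for the duration of the test.

```python
# Sketch: an ephemeral test environment that only costs money while a test runs.
# Assumes AWS credentials are configured; AMI_ID is a hypothetical placeholder
# for an image prepared with the release under test.
import boto3

AMI_ID = "ami-0123456789abcdef0"  # hypothetical

ec2 = boto3.client("ec2", region_name="us-east-1")

def run_regression_test():
    # Spin up the environment only when the test is ready to run.
    instances = ec2.run_instances(
        ImageId=AMI_ID, InstanceType="t2.micro", MinCount=1, MaxCount=1
    )["Instances"]
    ids = [i["InstanceId"] for i in instances]
    try:
        pass  # ... deploy the release and run the functional/regression suite ...
    finally:
        # Tear it down immediately; idle hardware no longer costs anything.
        ec2.terminate_instances(InstanceIds=ids)

if __name__ == "__main__":
    run_regression_test()
```

The next test round just runs the same script again, which is exactly why the old "keep the hardware untouched" habit disappears.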

Tuesday, November 5, 2013

Is Risk Really Holding Back Big Data Projects?

In his WSJ article titled "The Risks of Big Data for Companies", Dr. Jordan lays out a number of risks that companies should be aware of before jumping into this new and uncharted market.

He argues that too much data could be dangerous for companies, which might stir up internal politics or make unjustified decisions too early based on the results. I suppose there is no risk at all in ignoring what the data tells us and just plodding along like a horse with blinders.

WHAT? If people seriously took the advice of this WSJ column, we would never advance in business. More importantly, the #1 risk keeping every CEO up at night is not whether a plant in Canada has better output than its twin in the US, thus creating an "inner company rivalry". No, the #1 threat is the competition. The competition can close down your market and make your company's revenue tumble.

Listening to this sort of advice is not only dangerous, but plainly stupid. No company that I have helped with Big Data projects has decided to throw out human intuition and experience in favor of decisions made on data alone. Instead, all functioning companies want to see whether the data can complement decision making. That is what a pilot will show, and the ensuing business case will prove it, or not. Big Data pilots should be run in small areas to show what we could know, without fear that we will uncover some terrible new insight that causes internal conflict.

If you are starting a new Big Data project, please do NOT read this article. Instead follow your own intuition and call me in as an adviser.

Thursday, October 31, 2013

Big Data Projects: Who is the Organization's Big Data Leader

The big question for any consulting organization trying to sell Big Data projects is finding the Big Data leader in the target organization. There is always one person who has spurred the company to use the massive amounts of data it has access to in order to create solid, decision-driving insight.

Who is that person? Is she the CIO? Is he the CMO? Or could she be a Director-level person with a special charter? Tom Davenport, the data analytics guru from Harvard, discusses this well in his Wall Street Journal blog. You can read it here. Tom pushes for a corporate decision on where the Big Data leader should sit. I agree with his overall takeaway from the Deloitte poll he cites that the leader should be someone who works enterprise-wide, so that an overall Big Data strategy can evolve. In a perfect world that would be great.

But I think reality is different in the companies whose Big Data adoption I have insight into. In reality, a Big Data project comes out of a need to know more about your customers than your competitors do. It has to start with a dream somewhere in the organization that says: if we had this piece of information, we could sell more, grow market share, or save the cost of creating products no one wants. This need could develop in an IT organization, but only if its goals are clearly aligned with the business goals, which sadly is not often the case. Most of the time, I have seen this need arise at the business unit level, where goals are clearly tied to the success of selling products.

Monday, October 14, 2013

If 64% plan a Big Data project, why are so few IT consulting companies prepared?

According to a Gartner survey, 64% of the companies surveyed plan to invest in some sort of Big Data project in 2013. Many of them will run a pilot, investing not so much in hardware as in time, with a singular goal: Where will Big Data pay off? How much needs to be invested to get which results?

Despite this strong opportunity, few IT consulting organizations have strengthened their ranks with specialists who can guide clients through the uncertainty of this new market. One indicator is the lack of posted career positions for Big Data sales specialists and consultants who can create a business assessment solid enough to justify further funding. There are a few firms that are not only specialized in this area but also host cloud environments where their clients can start a real pilot without any data center investment. Those few should do well in the upcoming boom while others scramble for Big Data market share.

Monday, September 30, 2013

Using Video within a Big Data Query for IVS

What is the real value of digital video in a Big Data query? Should video even be considered as another source of data?

Most Big Data applications are after one simple thing: getting decision-driving information out of multiple, disparate, structured and unstructured data sources. If all we had to do were query our RDBMSs to reach good decisions, Big Data as a market would not exist. But it isn't that simple, and sources like weblogs and Twitter feeds, both examples of large sets of unstructured data, are extremely useful to data scientists hunting for decision-driving data.

Well, what about video? Can digital video be useful in a query? Early indications are that it can, and the more you think about it, the more good, solid industry use cases you find. IVS (intelligent video surveillance) is clearly an early adopter. The more that intelligent, digital computer vision algorithms can be combined with queries of other databases to form a real-time, interactive query process, the more intelligible and actionable information we can glean from the query. Yes, there are computer vision algorithms that can search through a digital video file and detect anomalies. But beyond raw speed, can we combine that search with a recognition database, such as facial recognition, and give back intelligible information? What about height, speed, color, make and model, or even an analysis of behavior?

What if an IVS system could automate a query showing that someone has entered a secure perimeter, that he is recognized as a known person on the terrorist watch list, and that he is spending time cutting through a fence? What security level should this anomaly trip to get action at the highest level? Anyone who claims that any anomaly should trigger the highest level does not understand the IVS market: after three or four false alarms, most companies turn off their detection and revert to good, old-fashioned human visual monitoring.
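Here is a minimal sketch of what such a query could look like once video detections land in an ordinary database. The table layout, identifiers, and escalation rule are all hypothetical; the point is that compound anomalies, not single detections, trigger the alert, which is exactly what keeps the false-alarm rate tolerable.

```python
# Sketch: join video-derived detection events against a watchlist and escalate
# only compound anomalies. Table and column names are hypothetical.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE detections (face_id TEXT, event TEXT, zone TEXT, ts INTEGER);
CREATE TABLE watchlist  (face_id TEXT, threat_level INTEGER);
INSERT INTO detections VALUES ('f-17', 'perimeter_breach', 'fence-3', 100),
                              ('f-17', 'loitering',        'fence-3', 160),
                              ('f-99', 'perimeter_breach', 'gate-1',  120);
INSERT INTO watchlist  VALUES ('f-17', 5);
""")

# Escalate only when a watchlisted face shows two or more anomalies in one zone.
rows = db.execute("""
SELECT d.face_id, d.zone, w.threat_level, COUNT(*) AS anomalies
FROM detections d JOIN watchlist w ON w.face_id = d.face_id
GROUP BY d.face_id, d.zone
HAVING anomalies >= 2
""").fetchall()

for face, zone, level, n in rows:
    print(f"ALERT level {level}: {face} tripped {n} anomalies at {zone}")
```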

Big Data holds this promise, but video is not an integral part of query capabilities yet. There is nothing on the market today that lets companies bring in a digital video feed and query it with SQL like any other data. But that could soon change (stay tuned).

Monday, September 23, 2013

Can we automatically index UGC?

Here is one of the biggest dilemmas under the current Big Data sun: will we ever be able to automatically index user-generated video content (UGC)? It seems like an impossibility, but it would have great benefits if it were possible.

Why would we want to create automatic indices and metadata for uploaded content? Well, for one, it would make the mass of user-created content uploaded to sites like YouTube searchable. Making this mass of video searchable means we can find the content we are interested in, saving hours of sifting through videos that are findable today only by the good graces of someone who titled them correctly or added descriptive metadata to their uploads.

Video is a content-rich source of data, since we humans perceive far more from video than could ever be tagged by hand. But an auto-tagging system could learn over time, using facial and behavioral recognition patterns to improve its accuracy. This would be an immediate and significant improvement over the voluntary methods we use today.

The starting point for automated indexing would be to bring the digital video stream into a standard format where it can be combined with other data sources and included in a standard SQL search. That is now possible, as you can see from my previous posts; a rough sketch of the pipeline follows.
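As a minimal sketch of such a pipeline, assuming OpenCV for frame extraction: sample keyframes from an upload and run each through a recognizer, storing the results as searchable metadata. The `tag_frame` function here is a hypothetical stand-in for a real facial/behavioral recognition model.

```python
# Sketch: sample keyframes from an uploaded video and emit searchable metadata.
# tag_frame() is a hypothetical placeholder for a real recognition model.
import json
import cv2

def tag_frame(frame):
    # Hypothetical: a real system would run face/object/behavior recognition
    # here and return descriptive tags for this frame.
    return ["person", "outdoor"]

def index_video(path, every_n_seconds=5):
    cap = cv2.VideoCapture(path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25  # fall back if FPS is unreported
    step = int(fps * every_n_seconds)
    tags, frame_no = {}, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if frame_no % step == 0:  # one keyframe every few seconds is enough
            tags[round(frame_no / fps, 1)] = tag_frame(frame)
        frame_no += 1
    cap.release()
    return json.dumps(tags)  # timestamped tags a search engine could index

print(index_video("upload.mp4"))
```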

Wednesday, September 18, 2013

Why is transferring lots of data an issue for Big Data?


This may just be a rhetorical question, but several companies are developing technologies that allow massive amounts of data to be transported to a DW environment where, using the latest Big Data tools, the data can be analyzed and used within larger queries.


One such company is Attunity (www.attunity.com), whose latest product blog post states clearly that their solution allows data to boldly go where and when you need it.

What I don't understand is this: wasn't the original point of MapReduce that data did not have to reside in a single DW in order to be queried? As I understand it, this was Google's motivation for building the technology, since they clearly understood that the data of the web would never sit in one spot.

According to a paper from Teradata, there is a growing problem with the amount of data produced by oil wells, and transferring that data across the country, or across countries, to a data center is expensive. My simplistic design instinct is to ask: why not put a couple of servers at the drilling site, build a Hadoop cluster onsite, and run queries remotely against the data where it sits? Isn't it harder to store, transport, and restore the massive amount of data? The sketch below shows the idea.
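A minimal sketch of the compute-to-data idea, using Hadoop Streaming with Python scripts on the hypothetical onsite cluster. The sensor log layout (comma-separated well ID, timestamp, pressure) is an assumption for illustration; only the small aggregated results ever leave the site.

```python
#!/usr/bin/env python
# mapper.py -- Hadoop Streaming mapper run on the onsite cluster.
# Reads raw well-sensor logs from stdin; log layout is hypothetical.
import sys

for line in sys.stdin:
    well_id, ts, pressure = line.strip().split(",")[:3]
    print(f"{well_id}\t{pressure}")

# ---------------------------------------------------------------
#!/usr/bin/env python
# reducer.py -- aggregates per-well averages, so kilobytes of results,
# not terabytes of raw readings, leave the drilling site.
import sys
from collections import defaultdict

totals, counts = defaultdict(float), defaultdict(int)
for line in sys.stdin:
    well_id, pressure = line.strip().split("\t")
    totals[well_id] += float(pressure)
    counts[well_id] += 1

for well_id in totals:
    print(f"{well_id}\t{totals[well_id] / counts[well_id]:.2f}")

# Submitted where the data lives, e.g.:
# hadoop jar hadoop-streaming.jar -files mapper.py,reducer.py \
#   -mapper mapper.py -reducer reducer.py \
#   -input /wells/raw -output /wells/summary
```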

Just a question. Please comment.

Thursday, September 12, 2013

Breakthrough: Hadoop-based Video Analytics

Pivotal, a company formed from EMC and VMware, has created a breakthrough in video analysis. Dr. Victor Fang from their Big Data Science Lab has a demo in which he chews through and analyzes 5 GB of MPEG-2 video using Pivotal HD and HAWQ in near real time. The output is a set of exception images plus place of movement, speed, and many other parameters of each detection.

Why is this big? It will revolutionize the security industry's Intelligent Video System market, because the system can learn to recognize exceptions and feed them to security personnel. The system will intelligently provide information to the monitoring team, rather than hoping to detect events with movement-parameter algorithms, which have yet to prove themselves.

Watch this video:
http://www.gopivotal.com/resources#http://bitcast-a.v1.sjc1.bitgravity.com/greenplum/pivotalvideo/Unstructured_Data_Video_Analytics_on_Hadoop_Session_1.m4v

Monday, September 9, 2013

What are good Big Data opportunities that startups should pursue?

The question came up in a great Quora thread, which I highly recommend; it should get a number of members of this group excited about joining or founding a startup. Most companies I speak with do not have the capability to analyze data the way Big Data tools would allow, and the gap between investment and payoff is far too long to warrant building that capability themselves.

So a huge startup opportunity lies in providing the information these companies would like to have, and judging by some of these ideas, large corporations are willing to pay a pretty penny for it.

The four areas where startups could make money with Big Data that I find most promising are:

1.) Recruiting: this industry is still suffering from poor conclusions from unstructured data
2.) Online Video Processing: much could be machine recognized to enable indexing
3.) Knowledge Management: still a fleeting target for many companies
4.) Store Location Information: the right mix of parameters could be gold for some

Any others? Comments?

Here the link to the Quora forum: http://www.quora.com/What-Big-Data-opportunities-will-be-most-interesting-and-profitable-for-startups

Thursday, September 5, 2013

"We don't really buy hardware anymore,"

This quote comes from a Louisiana-based healthcare staffing company. Five years ago they would buy about 5,000 servers a year; this year they bought only one from HP (see details in article below). This is a significant change that most IT workers out there had better listen to intently.

Who does this trend not affect? It does not affect the server makers in total, since computing power will still be needed in the world. But it does affect the data center workers who have just learned how to get everything working in their server farm. That skill set will become a museum piece as more and more companies push their workloads out to the cloud.

It is time for the IT department to plan its transition out of the blissful life of being the most powerful organization in the company and into being an enabler of information flow, on half the budget.

CIOs who do not have a five-year plan to decrease their budget by 50% and increase IT functionality by 300% need to be replaced by people who think outside the box and can efficiently meld the trends of this transition with the business needs of their company. This, along with Big Data, will be the greatest transition we have seen in IT since the introduction of the RDBMS. As with so many trends, many will be caught in the middle, unaware that things could change so quickly.

If there is anything we should learn from the trends of the past, it is that everything will change much quicker than we thought it would. You can bet your career on it!

Monday, August 26, 2013

FT states it well: Big changes over the next five years

Please read this article:

Tech executives facing up to hard realities of the cloud

The FT columnist Richard Waters is finally describing the pink elephant in the boardroom of so many tech giants around the world. I have stated it publicly and will do so again here:

In the next five years we will see IT increasing its functionality by 300% while decreasing its budget by 50%.

The proof is in the pudding, or better, in the cloud. Countless executives from large IT users such as DHL have said again and again that they want to buy computing services the same way they currently buy power and water. DHL wants to look at the cost of shipping a package as 100% and know exactly that the IT burden on that cost is, say, 8% (a number I just made up).

But as you can imagine, in today's marketplace this is not possible. Servers and mainframes bought years ago are coming up for refresh, and enterprise applications are bloating the need for more storage. Yes, after the financial year is over, DHL can go through and calculate its IT costs, but they are far too high and uncontrollable in a cycle like this. They want what they get from the power company: if they ship a lot this month, they pay a lot; if they ship little, their bill should go down to match.

Amazon gets it; maybe not through ingenious foresight, but because they were in the right place to rent out server capacity. The big server manufacturers, including the one I work for, do not.


Wednesday, August 21, 2013

Big Data Easy Start: Amazon Elastic MapReduce (Amazon EMR)


Amazon Web Services has a product that could make it easy to break into the world of Big Data in a real way. With Elastic MapReduce, any company can use the hosted Hadoop framework and start building data-intensive tasks for applications without having to build the underlying infrastructure or tune the Hadoop clusters themselves. Not that tuning is impossible, and products from Hortonworks make it easier, but you would have to build a lot of structure before you ever really saw results. EMR is a clearly easy way to pay your $90, get full access to infrastructure, and start web indexing, analyzing log files, mining data, doing traditional data warehousing, machine learning, financial analysis, biometrics research, or even scientific simulation.
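As a sketch of how little is needed to get going, here is a minimal example using the boto3 SDK to launch a throwaway EMR cluster that runs one Hadoop Streaming step over data in S3 and then terminates itself. Bucket names, script paths, and instance details are hypothetical placeholders.

```python
# Sketch: launch a self-terminating EMR cluster running one streaming step.
# Assumes AWS credentials and the default EMR IAM roles exist; names are
# hypothetical.
import boto3

emr = boto3.client("emr", region_name="us-east-1")

response = emr.run_job_flow(
    Name="log-analysis-pilot",
    ReleaseLabel="emr-5.36.0",
    Instances={
        "MasterInstanceType": "m5.xlarge",
        "SlaveInstanceType": "m5.xlarge",
        "InstanceCount": 3,
        "KeepJobFlowAliveWhenNoSteps": False,  # tear down when the step ends
    },
    Steps=[{
        "Name": "analyze-weblogs",
        "ActionOnFailure": "TERMINATE_CLUSTER",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["hadoop-streaming",
                     "-files", "s3://my-bucket/mapper.py,s3://my-bucket/reducer.py",
                     "-mapper", "mapper.py", "-reducer", "reducer.py",
                     "-input", "s3://my-bucket/weblogs/",
                     "-output", "s3://my-bucket/results/"],
        },
    }],
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
print("Cluster:", response["JobFlowId"])
```

No servers to rack, no Hadoop to tune: the cluster exists only as long as the analysis runs.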

For many companies the start into Big Data is expensive, cumbersome, and without clearly foreseeable results. So many sit on the sidelines of Big Data and wait for the competition to make the first move. By then, unfortunately, catching up will be very expensive and hard.



Should these projects prove themselves and produce real data that aids decision making, then you can think about the big vendors' tools and infrastructure. This is a great way to start.

Monday, August 19, 2013

Big Data growth in tools or in Information Mediaries?

Where is the growth going to be for Big Data? Where are the money makers going to be? Look at the current market: companies putting out Big Data tools, all promising to make Hadoop easier to use, each hoping to garner 10% of an up-and-coming, bubbly market.

Personally, I know many people in IT departments, and they are overworked just getting their business projects completed on time; they do not have a lot of time or resources to devote to getting some new MapReduce tool working in their data warehouse. I am certainly exaggerating, and I know there are many working on Big Data in their IT departments, but I think those of us chasing the money might be missing a very large development step in between.

Instead of a Big Data tool maker selling its wares to a big company with a lot of data, what about the many companies putting the outputs of these efforts to work and selling the actual results of the data analysis? One such company is masFlight, which uses the platform it has worked hard to install as a service at an airline's site to immediately analyze that data for the decision results the airline needs. masFlight is an excellent combination of smart data gurus applying their industry knowledge to create an industry solution. One step further is RetailNext, which does not even sell a solution platform, but instead the raw data output, primed for retail decision making. RetailNext claims to be the ultimate source for brick-and-mortar retailers, providing them with the in-store data they need.

So, to answer the Big Data money question ("Where is the money?"), we need to look not only at the companies making the tools, the Hortonworks and Clouderas, but also at the information mediaries using those tools to provide industry with the decision-making data it needs to gain a competitive edge.

Will European Countries start with Data Protection schemes?

http://www.bbc.co.uk/news/world-europe-23178284

As we watch more and more governments admit to data surveillance within their own borders, you will start to see more and more countries pass protectionist laws in the hope that data will remain secure. Like everything any government does, expect this to be slow and very ineffective. By the time new legislation is drafted, debated, and finally put into law, many moon cycles will have passed and we will have completely different concerns dominating the press.

What I do love about this article is the clear and immediate condemnation of the NSA across the world, followed only slowly by local European governments' admissions to either copying the same tactics, as in France, or cooperating heavily, as in Germany. Opposition parties are quick to draw swords, but would not be sure how to act if they were in power during this post-9/11 era, in which we are all enjoying a modicum of peace without global terrorist attacks.


Saturday, August 10, 2013

Truth exposed by Big Data: Humans are fundamentally of a greater order than algorithms

This is a bit of a review of ideas explained in a presentation by Stephen Cohen, a founding member of Palantir, at a Wired Magazine seminar last year. For the purpose of this discussion, an algorithm is defined as a plan so well defined that there is no ambiguity in its execution.

That might need to sit for a bit, but Cohen explains it through history: before the industrial revolution, every product was produced individually, using ad hoc methods. Then came the industrial revolution, where we learned to mass produce and therefore created algorithms, which consistently reproduced the steps in a production process.

This is the foundation of an algorithm, which in turn is the foundation of every computing process built on the confines of Boolean logic (if-then-else). The algorithm is ruthless in its ability to repeat its performance given a set of explicit, upfront inputs. It never wavers, and it can do its repetitive task many times over.

Big data is the phenomenon of these algorithms not only doing their task, but putting out information along the way. The big data concept, then, is a phenomenon of algorithms self-propagating an information flow. Due to this recurring nature, the sheer amount of data is exploding not only in size, but also in type.
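To make that "information flow along the way" concrete, here is a tiny hypothetical sketch: a routine that does its explicit if-then-else job but also emits a structured event for every decision, which is exactly the data exhaust big data feeds on.

```python
# Tiny illustration of data exhaust: the algorithm does its job, and every
# decision also emits a structured event that can be collected and analyzed.
import json
import time

events = []  # in practice this would stream into a log pipeline

def approve_order(order_total, credit_limit):
    decision = order_total <= credit_limit  # the explicit, unambiguous rule
    events.append(json.dumps({
        "ts": time.time(), "total": order_total,
        "limit": credit_limit, "approved": decision,
    }))
    return decision

approve_order(120.0, 500.0)
approve_order(900.0, 500.0)
print("\n".join(events))  # the by-product: data about every decision made
```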

So, as exciting as the algorithms are, what can’t they do?

As amazing as algorithms are, the fact that they make decisions without context for quality is exactly their limitation, and it eventually bounds their potential. An algorithm cannot process or produce qualitative data like hunger, fear, or happiness. And it is exactly this qualitative data that we humans need to make decisions.

In order for an algorithm to make decisions, I have to strip the data of its qualitative nature, leaving it shallow and open to interpretation. I have to rate my hunger on a scale from one to ten, which cannot be done without ambiguity.

Algorithms also fail to capture subtle contexts, because capturing them would make their efficiency go away: algorithms are efficient precisely because everything fed to them must be explicit and upfront. A complex human situation can be understood by humans but is very hard to communicate, and thus hard to break down into purely quantitative data and feed into an algorithm.

The final result of all this good stuff: Computers will never replace humans.

All of this information is copied 100% from a speech given by Stephen Cohen, founding member of Palantir. No original parts have been added and I take no credit for authorship.

Thursday, August 8, 2013

Perfect Marriage: CSC and Infochimps


More and more Big Data startups will be looking for an exit to their investments, and this is a perfect example of things working out. How many times have we seen companies in a bubble all jump on a hot topic and get floods of VC dollars pumped into them? As in other hypes, there are literally hundreds of startups all claiming they will win 10% market share. Mathematically, many people will lose their shorts.


What I really like about CSC is their openness to the cloud and their leadership in taking clients into areas that need cloud. They have been doing this for years now, and Infochimps, which was really not a tool company but a professional services model, is a perfect complement of capabilities. This will augment CSC to offer even greater services that result in unique business value for their clients. Infochimps' technology will see client engagements they had previously only dreamed of.

Compliments to both management teams for having the foresight to complete this marriage!

More info:
http://gigaom.com/2013/08/07/csc-buys-infochimps-and-its-big-data-platform/
http://blog.infochimps.com/2013/08/06/infochimps-a-csc-company-big-data-made-better/


Germany invests 20 times more in industrially-relevant R&D than the U.S.

Germany invests 20 times more, as a share of GDP, in industrially-relevant research and development than the U.S. 

The article and research are here.

I know, I hate to bring up the obvious and politically motivated question, but just how much is our government doing to make the US more competitive in this world economy?

Before everyone jumps up and comments about how our education system needs to change, let me save you some keystrokes: I completely agree. The major change needed in the US is our education system, no question. But education system changes will take a long time to take hold, whereas the immediate course of our economy could be steered by the current administration.

To come to Obama's defense, not everything the government invests in will pay off; see Solyndra. But what I want to bring attention to is the mere size of the investments. Compared to Germany, we invest one twentieth as much relative to GDP. That means the immediate effect on the economy and the workforce will be felt far less in the US than it will in Germany.

I don't want to sound pro-German, since they do many things wrong as well, but the results are clear. When Germany has an economic downturn, there are far fewer unemployed workers in the marketplace than when the US goes through wild swings like we did in 2008. Banking controls? Please; that is why the good German banks have subsidiaries in the US, since they realize that fewer controls can make them all personally rich.

There is nothing wrong with emulating what works in Germany. Let's study the way they fund innovation and find industries that need a boost to grow strong and keep jobs in the US. Not too hard to ask, is it?

Wednesday, August 7, 2013

Cloud computing could lose because of the NSA? Oh please!

Cloud computing industry could lose on NSA security concerns

http://www.itif.org/publications/how-much-will-prism-cost-us-cloud-computing-industry

Somehow I find it fascinating that people can still spread the rumor that putting data in the cloud somehow makes it insecure. Strangely enough, the only people propagating these false rumors are those who do not understand technology and have not grasped what a network is. I try to educate these Luddites: the only way to keep data 100% secure is to lock it up in your house, on a server that is not on any network, wired or Wi-Fi. Most people think that if the data is within their four walls it is certainly secure, but fail to see that as soon as they hook up a simple connection to the Internet, it is accessible by thousands.

But people fail to see that, and do not understand that a good cloud provider has security tools that dwarf whatever they have put on their own firewall. If they really want their data in secure hands, they should choose a serious cloud provider.

This runs against our natural way of thinking, which has struggled to keep up ever since the invention of the transistor some 40 years ago.