Accenture Technology Labs Blog
Bold thinking, commentary and application of new technologies that address many of the key business challenges facing organizations today.
Teresa Tung, Ph.D.
Apigee kicked off its inaugural “I Love APIs” conference in San Francisco during the first week of November; the event was oversubscribed with 800 attendees. Accenture had a large presence demonstrating just how much we love APIs: leaders from across the firm, including Mobility, Communications Media and Technology, Emerging Technology Innovation (ETI), and of course TechLabs, delivered key talks and Internet of Things demos.
Apigee CEO Chet Kapoor opened the conference with a familiar theme, “Every Business is a Digital Business,” from our very own Accenture Technology Vision 2013. Labs’ Mike Redding’s keynote brought this message to life: APIs matter not just to “digital natives” but may have even broader impact for “digital immigrants” in industries like retail, healthcare, and fleet management, keeping their businesses relevant and helping them tap into new capabilities enabled by analytics and connectivity.
To enable the digital business, my panel discussion “From Projects to Products: Innovation in a Digital World” explored the importance of thinking of the API as a product. No longer confined to the realm of geeks, today’s APIs are business-level artifacts that help connect new partners and unlock efficiency and innovation.
I shared an example from the telecommunications industry where we built an API for device activation, targeted first at retail partners: when you buy a phone at a store, the retail partner can activate connectivity on the spot. We focused initially on this partner on-boarding use case to build excitement about the API’s possibilities, and it allowed us to speed up the partner integration process from three to five months to a matter of days.
But the same on-boarding API can spark other innovations. When someone buys a phone on Craigslist or eBay, external developers can use the same API to activate the device. Device activation also applies to machine-to-machine (M2M) communications in the Internet of Things. The point is to pick a relevant use case, get started developing an API product, and then build on that success.
Getting started with APIs was covered in the “API Journey: A Story of Technology, Organizations, and People” keynote panel. Here, ETI’s Adam Burden identified confusion between SOA and APIs as a key question he hears from many of our clients, many of whom have made significant multi-year SOA investments. SOA and APIs are complementary technologies: SOA focuses more on integration, whereas APIs are the product in the digital business.
John Elliott, in the closing keynote panel “Decision Making for the Digital Transformation,” addressed how the move to digital shifts traditional product strategy to an ecosystem-based perspective. John brought up the Philips Hue programmable light bulb as an example where, with APIs, even a decision about how to capture the Hue’s value becomes more complex: Do we monetize at the point of sale of the bulb? Assess it as a driver of brand awareness? Or value it as an enabler to play in the digital ecosystem? The API marries the physical and digital footprints, which in turn unlocks an ecosystem strategy based on allowing others to tap into programmability and data.
What is clear from this conference is that APIs are a key component of digital transformation for many businesses. API success is not ad hoc, and it is not just about technology; success elevates the API to a product that drives new business outcomes. APIs unlock new opportunities around efficiency, innovation, and ecosystems.
Daniel S. Kaplan
Nearly everyone in the Wearable Technology community is excited about Google Glass. Glass is “making the pie bigger” for the wearable tech market by bringing some much needed attention to a previously underexposed field. Google might not be the first to proclaim “Eureka!”, but they certainly have been the loudest. Developers can’t wait to get hands-on with this fascinating new technology.
The problem is no one knows what to do with it yet.
A few key points to make about Google Glass:
- Glass is a device for updates - Glass natively operates much the way your phone syncs with an email account or a Twitter account. Updates arrive irregularly, and the existence of a new message on the server does not translate directly into a notification on the device. Glass must sync with the account for the notification to arrive – there is no real-time data.
- Glass is not [natively] Android – Google has allowed and even encouraged the community to develop Android apps for Glass, but functionality is still limited. Android applications must be launched manually via a tethered computer running the Android Debug Bridge (adb) or through community-created hacks (e.g., Launchy).
- Glass was built for the Mirror API – The size, functionality, and relative stylishness of this always-on device are incredible – but at the cost of battery life. Android will become viable for this form factor only after battery performance improves. For now, because it consumes minimal battery, the Mirror API can be a powerful tool in the right applications.
Developers in the Glass community need to rethink the standard mobile model and use Glass for its intended application: in conjunction with the Mirror API. Build server-side applications that monitor real-time processes or events and then notify the user of the important parts. Given this architectural model, the most suitable use cases for a consumer Glass application are those that enhance a wearer’s current activities or prepare him for upcoming events.
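To make the model concrete, here is a minimal sketch of that server-side push: a hypothetical watcher inserting a text card through the Mirror API’s timeline endpoint. Obtaining the OAuth access token (a standard Google OAuth 2.0 flow with the Glass timeline scope) and the event-monitoring logic itself are both assumed to exist elsewhere.

```python
# A minimal sketch of the push model: a server-side process detects an
# event and inserts a simple text card into the wearer's timeline via
# the Mirror API. Access token acquisition is omitted here.
import requests

MIRROR_TIMELINE_URL = "https://www.googleapis.com/mirror/v1/timeline"

def push_card(access_token, message):
    """Insert a text card that appears on Glass at its next sync."""
    card = {
        "text": message,
        "notification": {"level": "DEFAULT"},  # chime when the card lands
    }
    resp = requests.post(
        MIRROR_TIMELINE_URL,
        json=card,
        headers={"Authorization": "Bearer " + access_token},
    )
    resp.raise_for_status()
    return resp.json()  # the created timeline item

# e.g., a hypothetical traffic watcher might call:
# push_card(token, "I-90 is backed up; leave 20 minutes early.")
```

Note that the card reaches the wearer only when Glass next syncs, which is exactly the update-driven behavior described above.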
A few scenarios that come to mind:
- A fantasy sports score update while watching a game
- A notification for a meteor shower overhead
- A traffic update for an upcoming trip
- A contact info card for the current speaker at a conference
- A “food ready” notification from a cooking app
Yes, Glass is limited in its functionality. However, it is unlimited in its possibility to personalize and augment the world around us.
I was recently a guest speaker on a panel at Twiliocon titled “A New Breed of APIs: What Your Company’s API Strategy Should Be,” moderated by Mashery, alongside speakers from Capital One and ESPN. Together we were charged with addressing what it means for an enterprise to build an API and open it to the developer community.
Many of the popular web APIs of today, like Twilio, Google, and Twitter, are meant for mass adoption. The conference host Twilio maintains a communications API platform that allows any developer to integrate capabilities like text messaging and voice over IP into an app, and Twilio has a successful API strategy based on monetizing those developers’ API usage. But how does this model apply to more traditional enterprises whose core systems may not be of interest or suitable for mass adoption?
Our panel explored the role of the enterprise API, which harnesses the flexibility of software and the power of the developer and applies them to core business functions. Enterprise APIs target internal and partner developers (rather than mass adoption) and are based on strategies that promote agility, innovation, and scale. For instance, Accenture’s time-reporting APIs will never generate interest from independent third-party developers, but they are very useful internally, allowing our tech-savvy employees to embed this capability into apps beyond what our CIO organization can build on its own.
As with public APIs, the developer is the key to a successful enterprise strategy. Whether external- or internal-facing, API success relies on harnessing the power of the developer to get things done quickly and to imagine new applications. It starts with creating APIs that are simple to understand, with compelling, targeted functionality (a different breed from service-oriented-architecture services, which must handle all the variations of complex logic a business needs to fulfill). A developer should be able to understand and begin using an API in a matter of minutes.
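For a sense of scale, here is a minimal sketch of what “productive in minutes” looks like against Twilio’s documented SMS endpoint; the account SID, auth token, and phone numbers below are placeholders, not working values.

```python
# Sending an SMS through Twilio's REST API in a dozen lines.
import requests

ACCOUNT_SID = "ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"  # placeholder
AUTH_TOKEN = "your_auth_token"                       # placeholder

resp = requests.post(
    "https://api.twilio.com/2010-04-01/Accounts/%s/Messages.json" % ACCOUNT_SID,
    auth=(ACCOUNT_SID, AUTH_TOKEN),  # HTTP basic auth, per Twilio's docs
    data={
        "From": "+15005550006",      # placeholder sender number
        "To": "+15551234567",        # placeholder recipient
        "Body": "Hello from the API",
    },
)
resp.raise_for_status()
print(resp.json()["sid"])  # identifier of the queued message
```

An enterprise API aspiring to the same developer experience should demand no more ceremony than this.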
And it requires promotion and support. For example, ESPN hosts internal hack-a-thons to make its developers aware of the available functionality and encourage them to create new apps. The key to success lies in generating excitement in the developer community and enabling a self-service model that helps the developers help the enterprise succeed.
While the enterprise use case may have different goals from external-facing APIs, success requires a product-centric approach that focuses on supporting the developer as the API consumer. Twiliocon was a celebration of the developer, from the hands-on coding workshops to the Hacker Olympics to a keynote that featured Twilio CEO Jeff Lawson writing and deploying Python code. It is recognition that with APIs, developers are the rock stars, and the key to success lies in incentivizing them to get things done.
Imagine you walk into a store to buy groceries. You are standing in front of the yogurt aisle, deciding which brand of yogurt to pick. You notice a QR code on the aisle and scan it. A customized coupon appears for a yogurt available on the shelf. The coupon was generated by an intelligent platform that knows your taste, budget, purchase history, and so on. You can decide whether or not to buy the brand and use the coupon.
Looking behind the scenes, the platform has had a significant impact on all parties involved:
By knowing you, the customer, it has provided you with a customized offer that you are likely to enjoy and benefit from. If you buy organic, the platform knows, and the coupon reflects that; if you have never purchased yogurt before, it can predict what you might like based on your taste profile and those of other customers. In short, it has provided a self-initiated, hassle-free, point-of-decision interaction. You don't have to spend time clipping coupons, hunting for coupons online before you shop, or hanging on to the coupons printed at checkout.
The yogurt manufacturer has captured the attention of a person interested in buying yogurt, at the point of decision. The platform provides aggregate, anonymous data to the manufacturer, so it can accurately measure the effectiveness (ROI) of its different promotion campaigns at the point of sale, whether in terms of resulting purchases or, better yet, long-run loyalty conversion.
The retailer has provided a better experience for both customers and suppliers and has monetized customers' attention and its in-store foot traffic. In particular, the platform gives the retailer an automated mechanism to charge suppliers for showing their promotions to interested customers at the point of purchase. The retailer can measure how many of the promotions shown to customers resulted in a purchase, in both the short and the long term, and charge suppliers accordingly, per purchase or per loyalty conversion.
Accenture is well positioned with a patent-pending technology to bring this to reality in the near future, to a store near you!
This post described one possible realization of this idea. The same concept can be delivered using non-QR-code technologies such as Bluetooth Low Energy (BLE), NFC, displays, and scanners. Read the point of view to find out more, and contact us at Accenture Technology Labs to discuss further.
If you read my previous three blogs about log content analytics, you may have thought I was done espousing my views. For better or worse, I have more to say. First of all, I lied about it being a three-part series; it is actually a four-part series now. You can sue me if you like, although I don’t own much, so it may be a futile exercise. :)
Continuing the thread of the past three blogs, you may be scratching your head asking what you can REALLY do that goes beyond vendor tools. In this part of the series I will show you an example of a log content analytics application we created at Accenture Technology Labs, how it sits in the vendor ecosystem, and how it builds on top of existing vendor solutions to create new functionality.
First off, why is additional insight needed? As the IT operations management field grows, so does log file management, and with that growth comes increased spending: as log files grow, enterprises must spend more to upgrade their infrastructure to accommodate the influx of information they generate. As log files grow, it also becomes more and more difficult to parse them, find errors, and track issues, particularly when cross-log correlations come into play. To this point, we identified opportunity areas for enhancement in ingestion and parsing, analysis and exploration, and visualization. Using our domain knowledge and expertise, we chose to focus on the analysis and exploration piece by providing a vendor-agnostic way to extract information from log files that requires little input from the end user.
We developed an asset framework we dubbed the “SMART Log AnalyzER,” or “SMARTER” for short. The framework provides enterprises with a systematic and effective mechanism to understand, analyze, and visualize heterogeneous log files and discover insights. What we found when analyzing log files is that many of them are really transaction logs.
What does that mean? It means the log files contain information that can be used to uniquely identify events, compute event probabilities and statistics, and discover the temporal relationships between events. It also means we can mine them and pull out a graph that depicts behaviors. In doing so, we created algorithms to discover temporal causality relationships. Having done this exercise again and again on specific data sets for specific clients, we strove to create an asset that could be used in a pipeline on various data sets without customization and without worrying about the underlying vendor infrastructure, be it an on-site data center or the cloud.
The SMARTER pipeline is a multi-step process. Log files are ingested into a vendor solution, where they are indexed; the SMARTER application then acts as a wrapper, or layer, above the vendor tool, extracting the log data into a vendor-agnostic format. This vendor-agnostic format is piped into our algorithms, which mine and extract the temporal causality relationships of trace events by treating the log files as a transaction log. Trace entries are linked together by discovering the unique identifier for a sequence of events. Additional statistics are mined that allow us to predict the likelihood of events following each other in time. This data can then be used to seed real-time analysis for anomaly detection and pattern recognition.
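To illustrate the mining step, here is a minimal sketch (not the SMARTER implementation itself) of treating a log as a transaction log: entries are grouped by a correlation ID, and simple statistics estimate how likely one event type is to follow another. The three-field line format is an assumption for illustration.

```python
# A minimal sketch of the mining idea: group log entries by transaction
# ID, then estimate P(next event | current event) from the resulting
# per-transaction event sequences.
import re
from collections import defaultdict

LINE_RE = re.compile(r"^(?P<ts>\S+) (?P<txn>\S+) (?P<event>\S+)")  # assumed format

def mine_transitions(lines):
    """Return (per-transaction event sequences, transition probabilities)."""
    sequences = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            sequences[m.group("txn")].append(m.group("event"))
    counts = defaultdict(lambda: defaultdict(int))
    for events in sequences.values():
        for a, b in zip(events, events[1:]):  # consecutive event pairs
            counts[a][b] += 1
    probs = {
        a: {b: n / sum(nexts.values()) for b, n in nexts.items()}
        for a, nexts in counts.items()
    }
    return sequences, probs
```

The resulting transition table is exactly the kind of statistic that can seed the real-time anomaly detection mentioned above.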
What is really fascinating is how we have been able to use data from large media companies to infer demographics, gender groups, political affiliations, and so on. The discovered data in turn can help marketing campaigns properly target audiences for maximum return with minimum effort. Using the same algorithms, we have also shown that we can detect anomalous patterns in data from logistics companies as packages and shipments traverse their transportation systems, allowing for correction and thus avoiding costly misplaced shipments. Further exploration also demonstrated the tool’s capability for pulling anomalous network events out of data transmission logs. Examples of the output are shown in the Fig. 2 screenshot.
Fig 2. Screenshot of Trace Sequence Analysis Output
To aid the end user seeking insight, we also developed filtering capabilities to drill down and see what is really going on without all the noise of a large mined graph structure. Events within a log file are also clustered together as their relationships strengthen over time. If events are filtered out by user selections, they can be toggled back into view, and an explanation of why events were filtered or considered anomalous is easily obtained by clicking on an event in the graph within our tool.
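Building on the earlier sketch (again an illustration under assumed formats, not the product’s internals), a mined transition table gives an immediate way to flag an anomaly and explain it: surface any observed transition whose estimated probability falls below a tunable threshold, keeping the probability as the explanation.

```python
# Builds on mine_transitions() from the previous sketch: flag observed
# transitions whose estimated probability is below a threshold, and
# keep the probability as the "why was this flagged?" explanation.
def flag_anomalies(sequences, probs, threshold=0.05):
    """Yield (txn_id, event_a, event_b, p) for unlikely transitions."""
    for txn_id, events in sequences.items():
        for a, b in zip(events, events[1:]):
            p = probs.get(a, {}).get(b, 0.0)  # never-seen pairs get p = 0
            if p < threshold:
                yield (txn_id, a, b, p)

# e.g., over an application log (format assumed as before):
# sequences, probs = mine_transitions(open("app.log"))
# for txn, a, b, p in flag_anomalies(sequences, probs):
#     print("txn %s: %s -> %s is rare (p=%.3f)" % (txn, a, b, p))
```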
Great! So how do you use it? The steps are simple:
- Select the input
- Select any parameters to filter out particular event views, a purely optional exercise
- View the output
- Gain insight!
It is as simple as that; SMARTER was designed to be a lightweight framework that extends the functionality of existing vendor tools! Our framework was also designed to run on copious amounts of data and scale elastically. If you are at all interested in log content analytics and how it can help you, reach out to the Data Insights team at Accenture Technology Labs and we can show you how log content analytics can help you gain more insight into your operations.
The future of log content analytics relies heavily on how extensible existing solution platforms and frameworks are. Naturally, this raises the question of which insights users will find useful and what can be garnered from log files. The near future of log content analytics lies in creating algorithms, platforms, and frameworks that support the following:
- Vendor agnostic views of the underlying data
- Parallel execution of tasks that scale elastically and horizontally
- Discovery of correlations within and across log files
- Semantic understanding and linking of events and concepts
- Machine learning for anomaly detection and error tracking in both real time and offline modalities
- Discovery of relationships and trends without knowledge of the underlying data
- Advanced recommendation systems for cleaning and presentation of log file contents
Whether vendors see a need for certain machine learning or mining approaches, and what value lies in insights beyond the parsing, storage, indexing, searching, alerting, and dashboarding of log file contents, only time will tell.
What log content analytics is today will change and go beyond the current offerings that provide only parsing, storage, indexing, searching, alerting, and dashboarding. Solution platforms that allow for the customization and implementation of machine learning algorithms will ultimately prevail. What can machine learning be used for? It can discover patterns, detect anomalies, find interesting facts about customer data, pull out trace correlations both within a log file and across log files, and uncover data trends. To accomplish this, platforms and solutions need to provide technical and economic scalability, advanced ingestion and parsing, analysis and exploration capabilities, and enhanced visual expressiveness. Beyond these attributes, today’s solutions must allow for the implementation of custom algorithms, and in the future must look toward trace mining, correlations, advanced pattern and anomaly detection, trend detection, and recommendation systems. Log content analytics can be as broad or as scoped as needed, but for growth an enterprise needs to look to the future and choose the solutions with the greatest extensibility.
In my last blog, I wrote about the importance of using logs to gather insights. In this edition, I’m exploring how log content management solutions aid in gathering insights by aggregating log files from disparate sources, indexing them, and making them searchable. Every solution offers a slightly different approach, and not all solutions are created equal. Ultimately, the decision about which solution to choose comes down to the types of log files that need to be indexed and the type of insight desired. Some solutions provide a simple aggregation service, others provide additional statistics, some provide more advanced connections and dashboards, while others enable further analytics and beyond. What you can learn from your log files depends on what information they contain, and what you can do with that information is, at the very least, enabled by the right choice of vendor platform or support.
There are many vendors out there, and they fall into two main categories: free public-use solutions and supported enterprise solutions. Beyond these two categories are the various levels of service and availability that vendors provide. The core facets to focus on when acquiring a solution for log management and log content analysis are:
- Scalability: Almost all of the vendors provide a solution that can scale to large data volumes; however, they differ in their approach. The following scalability issues should be considered:
- Technically scalable:
- Can the solution deliver on its technical promises and does it scale horizontally?
- Economically scalable:
- The economic scalability of a solution is also paramount for an enterprise looking to keep a tight rein on its expenses. Quick questions to ask: How expensive is the solution to run? What is the cost of support, and is the solution elastic? For public-use solutions, what happens when support is needed? For paid software, what are the costs of ongoing support from the vendor?
- Ingestion and Parsing: This can be a thorn in the side of anyone who has tried to ingest abnormal or oddball log file formats. The ability of a solution to handle numerous log file formats can mean the difference between garbage-in-garbage-out and real insights from indexed data. To that end, a solution must be able to handle the following three issues gracefully:
- Ill-formed and poor quality log file inputs:
- Any viable solution must be able to handle log files with dirty or poorly formed data values and do so gracefully (generating notifications, re-formatting data, normalizing data, cleaning data, etc.) while still extracting information from the portions that are of high quality. It must do so without propagating data errors and ill-formed data to other systems that interact with its data store. (A minimal sketch of this kind of graceful handling appears after this list.)
- Unknown log file formats:
- Many file formats are proprietary and do not follow well-defined formats such as CSV, tab-delimited files, or Apache web logs. A good solution will provide a wizard or guide for first-time file ingestion and pattern extraction, plus a mechanism for extracting log files that are too difficult to fit a known template or to walk through an ingestion wizard/guide.
- Heterogeneous log file formats:
- In some instances, a log file may be well-formed and structured yet difficult to parse because of its diversity. A solution must provide a mechanism expressive enough to parse a diverse file and complicated trace entries.
- Analysis and Exploration Capabilities: The core functionality that enables log content analytics as a solution is the capability to search for and pull out the right data at the right time and display it in the right way. A solution must allow for the following:
- Data exploration: Existing data may be arcane or archaic in its layout, domain knowledge about it may be scarce, or documentation may not exist at all. Formulating the right query is therefore challenging, given limited semantic understanding of the log content. A log content analytics solution must provide a mechanism that allows an end user to understand the data.
- Exploration guidance: Lacking domain knowledge about a data set makes it difficult to ascertain what insights lie within it and how to extract them. As a result, determining what analysis to perform is non-trivial and leads to missed insights or increased time to discovery. An ideal solution should provide a mechanism not only to explore the data but also to suggest where to look and what queries to form.
- Query expressiveness: While exploring the data is critical, being able to express a query succinctly and precisely is just as important. If a query language is arcane and difficult to understand, it can inhibit the discovery process. The query language must allow for complex questions while remaining elegant and capable of pulling information out of heterogeneous sources. It should also be easy to debug when there is a problem in the structure of a query.
- Visualization Expressiveness: In addition to all of the other challenges of log content analytics, knowing which visualizations to use can flummox many. A viable solution must provide a means to visualize the data, be it a line graph, bar chart, or pie chart. In addition to dashboarding capabilities, an optimal solution should provide, or take steps toward, mechanisms for the following:
- Visualization optimization: A solution needs to give the end user feedback on which visualizations are best for certain sets of data. While a pie chart, for example, can display a series of numbers, a line graph may be a better representation. This type of guidance can make dashboards come alive.
- Preprocessing/Post-processing: Some visualizations may require additional pre- or post-processing of the source data set, which adds significant human overhead. For example, if a log file contains raw sensor data, cleaning and smoothing may be required to remove aberrations and improve the signal-to-noise ratio (SNR). Failure to do so would result in skewed visualizations and other undesirable artifacts (e.g., pops, hisses, spikes).
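As promised under “Ill-formed and poor quality log file inputs” above, here is a minimal sketch of graceful ingestion; it is my own illustration, not any vendor’s implementation, and the log line format is an assumption. Well-formed lines are parsed; bad lines are quarantined with a notification rather than silently dropped or propagated downstream.

```python
# Graceful ingestion sketch: parse what we can, quarantine the rest,
# and emit a warning for each malformed line. The timestamp/level/
# message format below is assumed for illustration.
import re
import logging

LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) (?P<message>.*)$"
)

def ingest(lines):
    """Return (parsed_records, quarantined_lines)."""
    parsed, quarantined = [], []
    for lineno, raw in enumerate(lines, 1):
        m = LINE_RE.match(raw.rstrip("\n"))
        if m:
            parsed.append(m.groupdict())       # clean record, safe to index
        else:
            quarantined.append((lineno, raw))  # keep for manual review
            logging.warning("quarantined malformed line %d", lineno)
    return parsed, quarantined

# records, bad = ingest(open("app.log"))
```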
My next blog will review what can be gleaned from log content analytics and the possibilities I see in the future.
Nathan Shetterley, R&D Manager – Accenture Technology Labs
Now you can access the Accenture Technology Vision on the go from your smartphone and tablet (iOS and Android). In keeping with our theme of Every Business is a Digital Business, we have created an app that enables you to explore the Vision and its key trends through videos and reports, as well as access related content. Download the app today.
There are tons and tons of log files residing in an enterprise. The old adage “one man’s trash is another man’s treasure” holds especially true for log files: administrators may see a bunch of junk, but the data team sees a treasure trove of scrumptious data “noshables”. It is all about context. In other words, if you want to gain insight, don’t throw it away!
So, how do you get insights from log files? Content analytics. In its simplest form, log content analytics is the science of making sense of computer-generated records. However, it is much more expansive than that: log content analytics is the application of analytics and semantic technologies to (semi-)automatically consume and analyze heterogeneous computer-generated log files, discovering and extracting relevant insights in a rationalized, structured form that can enable a wide range of enterprise activities.
What can log content analytics be used for? Simple: it enables the following:
- Audit or Regulatory Compliance – The goal that corporations or public agencies aspire to in their efforts to ensure that applications, for example, adhere to relevant laws or regulations.
- Security Policy Compliance – The adherence of individuals and applications to the policies of a company that ensure protected access to assets.
- Digital Forensic Investigation – The investigation of details and tracking of the footprints an application leaves as it operates.
- Security Incident Response – Monitoring for security violations that may be present in alert logs.
- Operational Intelligence – Business analytics that deliver visibility and insight into business operations often in real-time.
- Anomaly Detection – The detection of patterns in a given data set that do not conform to an established normal behavior.
- Error Tracking – The detection of error messages and alerts.
- Application Debugging – The process of debugging an application through the use of trace logs.
Understanding the definition of log content analytics is but one part of the puzzle of understanding what lies within log files. This leads to the next question: how is log content analytics generally performed today? In general, there are six basic steps to extracting the information and utilizing it.
File selection and ingestion – During the selection and ingestion process, log files are selected and consumed. This may be done either manually or through an automated process via a vendor tool. It is at this stage where tools aggregate log files from many sources into a single point of access.
Parsing and extraction – Log files are parsed, and relevant features and values are extracted. This is the most critical stage, enabling all of the following steps; without it, storing, indexing, analysis, visualization, and publication could not take place.
Storage and indexing – Once a log file is parsed, it is stored and its contents are made indexable for searching and querying. This is the second most important phase, as it enables the analysis and exploration phase.
Analysis and exploration – This phase is where a user or administrator interacts with log files, generates queries, analyzes the results, and iterates until the desired information is discovered.
Visualization – The visualization phase is where the data starts to come to life with bar charts, line graphs, and so on for later use in dashboards. This phase is important for properly communicating the information contained in log files in a succinct and impactful manner.
Publication and usage of results – The last stage, where individual visualizations are collected into dashboards to gain actionable insight, and gathered information may be pushed to other destinations for consumption.
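As a toy illustration of the six steps compressed into a few lines (my own sketch, with an assumed log format): ingest a file, parse each line, “index” into a counter, analyze with one question, and publish a printed summary. Real tools perform each step at far greater depth.

```python
# Toy end-to-end pipeline: ingest -> parse -> index -> analyze -> publish.
# The timestamp/level/message line format is assumed for illustration.
import re
from collections import Counter

LINE_RE = re.compile(r"^(?P<ts>\S+) (?P<level>[A-Z]+) (?P<msg>.*)$")

def analyze(path):
    levels = Counter()                         # our tiny "index"
    with open(path) as f:                      # 1. selection and ingestion
        for raw in f:
            m = LINE_RE.match(raw)             # 2. parsing and extraction
            if m:
                levels[m.group("level")] += 1  # 3. storage and indexing
    # 4. analysis and exploration: how error-heavy is this log?
    total = sum(levels.values()) or 1
    # 5./6. visualization and publication, reduced to a printed summary
    print("events by level:", dict(levels))
    print("error rate: %.1f%%" % (100.0 * levels["ERROR"] / total))

# analyze("/var/log/app.log")
```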
Stay tuned for my second blog, where I’ll share my thoughts on what makes a good log content management solution and how these vendor solutions can be used as a starting point for log content analytics.
Who does LeBron James assist the most? Let’s pretend for a second that you know nothing about the NBA. Not that hard for some of us. How would you have answered this question? Try googling it - and I’m sure you’ll eventually get to the answer, but it’s difficult to find. Business leaders face this challenge in answering their questions every day. For example, when do my east and west coast offices collaborate well and how can I encourage more collaboration?
(The answer to the first question is Chris Bosh, by the way.)
At Accenture Tech Labs we set aside time for our researchers to explore new technologies via their own interests and, sometimes, those explorations grow into projects. Recently, one of our researchers wanted to showcase how fast you can turn data into insights with the right team of data scientists, developers, and data artists. But what data set to use? For that, we returned to our basketball question above and chose the 2012-2013 NBA regular season.
Today, we’re proud to announce that one of these bottom-up projects launched last month, and we want to share it with you. You can explore the results yourself at http://hotshotcharts.com. We also have a video walkthrough.
The Basketball Data Insights web app, which we like to call Hotshot Charts, is a data exploration tool built on open-source technology. It lets you easily explore whom your favorite NBA player assists and where he shoots from, with what accuracy.
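For a flavor of the kind of question the app answers, here is a minimal sketch using pandas on a hypothetical play-by-play CSV with “passer” and “scorer” columns; the file name and schema are assumptions, not the app’s actual data model.

```python
# Answering "whom does LeBron James assist the most?" from a
# hypothetical assists table with one row per assisted basket.
import pandas as pd

plays = pd.read_csv("nba_2012_2013_assists.csv")  # hypothetical file

lebron = plays[plays["passer"] == "LeBron James"]
top_targets = lebron["scorer"].value_counts()

print(top_targets.head())  # the top row is the most-assisted teammate
```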
Sports fans (fanatics?) aren’t the only ones looking to do more with data. Technology is moving analytical capability and data visualization closer and closer to business users, who are asking their IT departments to help derive more and more value from their data. It is important for IT professionals to understand all of the options in their toolkit, whether custom development, existing tools such as Tableau, QlikView, and Spotfire, or emerging tools like Platfora. Our Data Insights R&D team believes that businesses that enable conversations with data will have a competitive edge.
We invite you to have that competitive edge over your friends in debates about who will come away with the 2013 NBA Championship via our Hotshot Charts.