(untagged)

Breaking Through the Analytic Limitations of Access and SQL

John_Joseph

1 Mar 2013

Breaking Through the Analytic Limitations of Access and SQL.

This article is in the Product Showcase section for our sponsors at CodeProject. These articles are intended to provide you with information on products and services that we consider useful and of value to developers.

Overview

Although they are often a cornerstone of a company’s analytic toolkit, traditional databases, such as Access, and query tools, such as SQL, are designed for storage and simple queries, not for creating the complex analytics that are required by today’s fast-moving businesses. New technologies can help organizations get over the analytic limitations of Access and SQL, especially when dealing with the demands to process more data, change analytics more frequently, and make analytics available to more decision makers – all at greater speed than ever before. To experience the benefits of an agile, visual analytic environment over Access and SQL, you can download a free-for-life copy of the Lavastorm Analytics Engine

Business Demands Are Driving the Need for More Specialized Analytic Tools

Many organizations use using standard relational databases, such as Access, and script-based query tools, such as SQL, for analytics. Because of their wide availability, these products have represented for years the core set of technologies on which business groups have managed their business. But these tools were designed for data storage and for simple queries that are often inadequate to tackle the analytic challenges that most businesses face today. Today most organizations are relying more heavily on analytics to compete and the analytics they need to perform to improve business performance are forcing them to deal with:

More data – Difference-making business improvements require analysis of more data and more disparate data sources, especially sources that that come from different business departments and even from other businesses, such as partners, suppliers, and third party information providers.
More speed – Business reaction time must be faster than traditional IT-led business intelligence initiatives which can delay changes because of limited IT resources.
More change – A dynamic business world brings new data sources, relationships, competitors & risks.
More decision makers – More decisions mean more business questions, forcing IT to turn from data owner to facilitator

Limitations of Relational Databases and Query Languages

Access and SQL fall down in the face of the today’s analytic challenges because they were designed around a relational database structure that does not lend itself to today’s fast moving, ad hoc analytic environment. Relational databases, such as Access, were designed for storage, reporting, and simple queries and have the following limitations when it comes to analytics:

Table structure limits flexibility – instead of thinking directly about the analytic problem at hand, analysts using relational databases are forced to view the problem in terms of tables and to work with a complex collection of tables. But tables impose a structure that is fairly rigid and blocks visibility to the data and imposes a high overhead even for "small" changes. In some cases, if you want to add a new data source to your analysis, you may need to tear down the original structure and rebuild it to take into account the new requirements. Tables also have size limits (2 GB for Access) that create a ceiling in terms of the amount of data an analyst can consider. If you come close to the ceiling, performance will degrade quickly and you will have to divide data up in less logical chunks and restructure your queries based on the different data divisions to perform your analysis.
Query interfaces are either too simplistic or too complicated – Databases, such as Access, offer wizard-based query tools (Access offers Expression Builder) , which limit the analysis of complex expressions, or script-based query languages, such as SQL, which increase the overhead associated with queries and limit the query capability to very technical analysts and programmers. SQL is also difficult to debug compared to more visual representations of analytic logic. There is no middle ground that serves the need for more managers and analysts to have free-form exploration or discovery capability – a critical analytic technique for ad hoc analytics and for projects where you are trying to identify root cause analysis and investigations require a long sequence of queries.
Data modelling creates chick and egg problem – One of the major characteristics of analytic projects undertaken with relational databases is the need to create a data model reflecting the relationship between all the different tables. This is a time consuming and difficult task and like the table structures themselves, it is essentially creating a rigid structure not suited to changing business requirements. When it comes to the success of an analytic project, this is often the hurdle that causes most people to stumble because building the data model presumes that you know all the questions that will be asked of the data set, but you usually don’t know which questions you will want to ask until you have built the data model on which the analytic application is built.

New Technologies Are Designed for Analytics, Agility

Today there are alternatives to relational databases that are designed from the ground up to address the analytic needs for more data, more speed, more change, and more decision makers. While several technology approaches exist, they all share these characteristics:

A flexible data model – No over-arching data model, or schema, is required so you can build your analytic incrementally and immediately without waiting to gather all possible requirements. This eliminates the overhead costs and effort associated with the rigid structure of a traditional relational database and makes it easier and more practical to unify highly diverse data sources or change an existing analysis.
Speed appropriate for rapid prototyping – The more flexible data model allows you to mash up data and build analytics very quickly compared with traditional database and business intelligence tools. This makes it more practical for you to consider new data sources, explore data more completely, evaluate new hypotheses, and make ad-hoc discoveries.
Easy to use query interfaces – Query interfaces can be visual in nature or search-based allowing not just programmers, but business analysts to investigate hypothesis and make discoveries. Visual models have the added benefits that they can make processes more consistent by visually documenting the step-by-step process, they can be used to automate analytic processes, and they allow rapid, pinpoint changes.

The Lavastorm Analytics Engine – An Agile Analytic Environment

The Lavastorm Analytics Engine is a new breed of analytic technology that provides an agile way to analyze, optimize, and control both your data and your processes. Over the past 15 years, the product has generated billions of dollars in business value in demanding analytic environments where organizations needed to handle extremely high data volume, integrate very dissimilar data, and respond in near-real time.

The Lavastorm Analytics Engine’s visual, discovery-based environment allows organizations to bring together diverse data with tremendous ease, helping them reduce analytic development time by 90% or more, and use the analytics to transform their business. The engine combines two unique capabilities that allow you to capture the potential value from the data:

Data discovery – You can combine large volumes of diverse data and explore it freely using a visual analytic development environment. Data discovery allows you to identify new insights or to use the enriched data to make better-informed decisions.
Continuous analytics – You can continuously run the visual analytic models that you create with the engine, allowing you to automate various analytic processes, such as data cleansing and data quality processes, and business processes.

Along with these capabilities, the Lavastorm Analytics Engine is offers advantages over Access and SQL for analytics when you have the following business requirements:

Business Requirement	Lavastorm Analytics Engine Solution
Analyze Federated, Disparate Data Sources	Acquires virtually any source data, including Access and other databases – no overarching schema required, just join and go No need to homogenize the data first Low overhead of data integration allows you to start immediately, change your analytic application and add new data easily as you see partial results Allows you to design, test, debug and deploy sophisticated analytics up to 90% faster than using traditional tools Combines siloed data without requiring SQL coding "Fine" analytic control allows you to profile, inspect, and transform data any way necessary Automates processes, including data quality processes Visual models expose analytic logic and create an audit trail, building trust with managers/peers
Complex and repetitive data manipulations	Visual model simplifies explanation of analytic path from start to finish Captures logic for future reuse, automation Self-documenting for repeatability, optimization Drill-down to examine details Does not require analytic off-shoots, such as temporary tables and table indices, that add to analytic overhead Drag and drop model modifications Permits examination of data at any step within the analytic, allowing you to answer, "Where does that figure come from?" Builds confidence in analytic insights and results Traceability and visualizations engender trust and better understanding Delivers up to 40% more accurate audits
Ad-hoc questioning for discovery, root cause analysis	Provides the speed and flexibility needed for data discovery analytics Schema-less approach Enables iterative approach, prior to development of all tables – accelerating time to results Allows new data sources, data fields to be brought in at any time without high cost/overhead Visual development/analysis Provides accurate view of logic for complex analyses Reveals exceptions/outliers, new insights Allows business users to develop their own analytics or collaborate with IT, leading to faster, more accurate results Enables fast development of conditional branching logic
Orchestrate systems for data unification, manipulation, and result publication	Lavastorm coordinates the data flow between systems to model the business processes and publish results Automates multi-step operations and processes, such as repetitive Excel manipulations or data cleansing processes Visual model captures the organization and sequence of system interaction, eliminating the need for manual tracking sometimes used with SQL
Capture, share, and reuse analytic logic	Visual components are easy to read and communicate Analytic models or model components can be reused and shared Trace Field allows you to pinpoint model components that are affected by changes
Scalability and performance	Lavastorm does not have the table limitations of Access that lead to compromises, additional table manipulations, inaccuracy and poor performance Lavastorm can perform complex, multi-step analytics over large data sets by harnessing the power of multi-core processors Analytics developed with desktop or personal editions of the Lavastorm Analytics Engine can be run on servers or server farms

Figure – A visual analytic model created with the Lavastorm Analytics Engine

Figure – Graphical business controls capture analytic logic and allow you to acquire data and assemble analytic applications quickly and easily

Figure – Analytic logic can be stored in composite "nodes" for sharing and re-use.

When to Use an Agile Analytics Solution Instead of Access and SQL

The advantages of the Lavastorm Analytics Engine can help any business where:

Business users need more control over their data
Business rules and logic are branched, conditional, or, otherwise, complex
Business processes change frequently
Data changes regularly and new data sources need to be analyzed
Repetitive Excel calculations exist that could be automated
Analytics that span departments are needed to improve business performance

Get your free copy of Lavastorm Analytics Engine

To experience the benefits of the Lavastorm Analytics Engine, download a free-for-life copy of the Lavastorm Analytics Engine, our powerful, visual and easy-to-use tool for anyone combining and analyzing data. The Public edition is a free version of our award winning engine and it allows you to harness the power of our enterprise-class analytics engine right on your desktop. It will help you:

Get more productive by reducing the time to create analytics by 90% or more compared to underpowered analytic tools, such as Excel or Access
Stop flying blind by unifying data locked away in silos or scattered on your desktop
Eliminate time spent waiting for others to integrate data or implement new analytics
Gain greater control for analyzing data against complex business logic and for manipulating data from Excel, CSV or ASCII files

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here