Latest Blogs

Beyond Transactional Consistency

Software has been growing organically for years; now it's time to scale it.

Evolution of Parallelism: Part 2

New query planning and optimization techniques are needed to deal with distributed, independent, and heterogeneous data sources.

The Evolution of Parallelism: Part 1

The amount of digital information created and replicated in the world will grow to an almost inconceivable 35 trillion gigabytes by 2020. New tools are needed to manage and analyze the vast amounts of data.

A Fresh Approach to a Big Data Problem

Lightwolf recognized that in today’s Internet age, the future of data management lies in distributed computing. Distributed data processing is the only viable solution for handling the massive data sets in common use today. As data and its users continue to proliferate across the enterprise, the storage, bandwidth, and computational capacity required to query and analyze these data sets is growing at an unprecedented rate. In an era where a one-terabyte hard drive costs less than $100, the complexity of dealing with the volume of data, the variety of data types, and the demands of data analysis is only going to increase.

This is the backdrop for horizontal scale out. It also answers the question of which ideas from the cloud are worth adopting and which enterprise software capabilities cannot be ignored.

Simplicity without Compromising Function


  • Massively parallel SQL and MapReduce SQL processing
  • Use of standard hardware
  • ACID transactions
  • System-level data consistency
  • Massive replication and sharding
  • Multi-level partitioning
  • Online capacity additions and migration
  • Storage redundancy
  • Geographic redundancy
  • High-availability, non-stop operations

Web 2.0 and SaaS

The ability to easily add customers, support new features, and improve quality while decreasing costs and promoting ease of use is key to growth for any company. For Web 2.0 and Software as a Service (SaaS) providers, pressure builds on a number of fronts beyond capacity as the organization grows and matures. These include the need to move infrastructure from “web quality” to enterprise quality; extend applications with additional features; provide non-stop 24x7 operations; and reduce the cost of customer acquisition and ongoing support with self-service automation. The database layer is at the heart of a business’s ability to scale application infrastructure and operations.

Over the past several years, there has been a growing realization that horizontal scale out is the only cost-effective, scalable way to support ever-increasing business continuity and capacity requirements. Horizontal scaling provides an economical approach to improving the throughput and performance of business applications. This is true for social networking, Web 2.0, and SaaS companies experiencing significant data management problems as a result of customer growth, and particularly true for the most successful companies experiencing hyper growth.

Live Business Intelligence: Large Enterprise

Large enterprises are dealing with an ever-growing need for analysis of large data sets, more data types (structured and unstructured), varied data sources (mobile devices, consumer, intra- and inter-enterprise), and more users (technical and non-technical, internal and external). Extracting useful information from data has become more essential and, at the same time, more complex. Whether it is business performance management, CRM analytics, supply chain analytics, risk and security management, web data analysis, production planning, services operations, or workforce analytics, the end goal is near real-time business intelligence to “connect the dots” for operational decision making. Waiting a couple of hours for batch loading is no longer acceptable. The time between when information is created and when analysis is presented to the decision maker needs to decrease from hours to minutes or seconds.

Web 2.0 Style Presentation of BI Data

Business intelligence is valued by an increasing number of users who need to easily view, interact with and analyze information that is relevant to them. These users want customized analysis on demand and want more than conventional offline reporting and slicing/dicing favored by analysts. BI is returning to its decision-making support roots while leveraging Web 2.0 style collaboration and presentation to support this new class of users.

Big Data

It is not unusual for a large enterprise to ingest 50 GB or more of new facts per day, which, at roughly 100 bytes per fact (the size these figures imply), translates to about 5.8 thousand new facts inserted per second. That figure rests on the unrealistic assumption that the load can be spread evenly across the entire day. Real workloads have peaks, which means that the database solution needs to support peak loading of 50k facts per second. There is no longer a good time to stop loading and perform analysis: growing numbers of users want on-demand analysis of live data. Reducing analysis time from hours to minutes or seconds on data sets of this size requires parallelization.
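
The ingest figures above can be checked with a few lines of arithmetic. This is a rough sketch, not a sizing tool: the 100-byte fact size is an assumption (the text does not state it; it is simply the value that reproduces the 5.8 thousand facts per second figure), and "GB" is taken to mean decimal gigabytes.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def facts_per_second(bytes_per_day: float, bytes_per_fact: float) -> float:
    """Average insert rate if the daily load were spread evenly over 24 hours."""
    return bytes_per_day / bytes_per_fact / SECONDS_PER_DAY


daily_bytes = 50e9   # 50 GB/day, from the text (decimal gigabytes assumed)
fact_size = 100      # bytes per fact: an assumption, not stated in the text
peak_rate = 50_000   # facts/second at peak, from the text

average_rate = facts_per_second(daily_bytes, fact_size)
peak_to_average = peak_rate / average_rate

print(f"average: {average_rate:,.0f} facts/s")         # average: 5,787 facts/s
print(f"peak is {peak_to_average:.1f}x the average")   # peak is 8.6x the average
```

Under these assumptions the peak is between eight and nine times the level-loaded average, which is why sizing for the average alone is not enough.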

Big data requirements have led a number of enterprises to look to the cloud for solutions from players such as Amazon, Facebook, Google, Yahoo, and others. These companies have leveraged MapReduce and related software such as Hadoop to automate the parallelization of large-scale data analysis workloads and support extreme volumes of data across a federation of commodity servers. However, solutions such as MapReduce must be improved upon to make them suitable for the enterprise. Although they support unstructured data and perform brute-force analysis well, these solutions are not suited to traditional workloads from operational data stores, where data is more structured. MapReduce does not easily interface with existing databases, and it was never intended to be used as a complete solution over structured data.
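
For readers unfamiliar with the MapReduce model mentioned above, the following is a minimal single-process sketch of its three phases (map, shuffle, reduce). It is illustrative only: it is not Hadoop's API or any vendor's implementation, and the function names and sample records are hypothetical. In a real deployment each phase would run in parallel across many servers.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, Iterable, Iterator, List, Tuple

Record = Dict[str, Any]
Pair = Tuple[str, int]


def map_phase(records: Iterable[Record],
              mapper: Callable[[Record], Iterable[Pair]]) -> Iterator[Pair]:
    """Apply the mapper to every record, emitting (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs: Iterable[Pair]) -> Dict[str, List[int]]:
    """Group emitted values by key, as the framework does between phases."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups: Dict[str, List[int]],
                 reducer: Callable[[str, List[int]], int]) -> Dict[str, int]:
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}


# Hypothetical sample data: three sales facts.
events = [
    {"region": "east", "amount": 10},
    {"region": "west", "amount": 5},
    {"region": "east", "amount": 7},
]


def emit_region_amount(record: Record) -> Iterable[Pair]:
    yield record["region"], record["amount"]


def sum_values(key: str, values: List[int]) -> int:
    return sum(values)


totals = reduce_phase(shuffle(map_phase(events, emit_region_amount)), sum_values)
print(totals)  # {'east': 17, 'west': 5}
```

Note that the grouping by region here is hand-written; a relational database would express the same aggregation declaratively, which is part of the gap between MapReduce and structured workloads described above.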

Lightflow from Lightwolf Technologies

Transactional Big Data for the Cloud and Enterprise

The Lightwolf team recognized the need to bring together leading-edge innovations from the cloud and relational capabilities from the enterprise to deliver a complete horizontal scaling solution for Web 2.0, SaaS, and BI applications. Lightflow from Lightwolf Technologies provides a powerful transactional data management solution for customers who need both horizontal scaling and relational database capabilities. It supports breakthrough query performance and analysis over large structured and unstructured data sets without sacrificing ACID transactions, high availability, or system-level consistency.