Research & Resources

Contact Tracking Visualization – Business Intelligence & Analytics
July 5, 2020

As more and more experts have rallied around limiting contacts as an effective strategy for managing epidemics[i], the role of analytics and analytical models in epidemic tracking and management has acquired a great deal of significance.

The recent progress of digital technology has produced huge volumes of spatial data, giving rise to location tracking systems that can offer detail in a range of application scenarios, such as the spread of Covid-19. Many governments worldwide, concerned by the number of coronavirus cases being reported, are investigating and applying some form of geospatial data analytics.

Analyzing smartphone owners’ locations (via Facebook, Google) can be a powerful tool for health authorities looking to track coronavirus. In addition, phone sensors (GPS, Bluetooth) can provide signaling data containing a person’s location, which can be used to reconstruct user trajectories. Data analytics tools can then analyze these spatial datasets via real-time dashboards that offer graphical interpretations and detailed visualizations of disease patterns.

We at Visvero also believe that data analytics solutions that track and monitor user locations and disease spread can help control the disease efficiently.

Mechanics of Contact Tracking applications

While the approach has differed across the world, the applications follow the same general method. As an example, the Government of India launched an application called Aarogya Setu (literally, a bridge to cure of disease).

The app leverages Bluetooth and GPS-based location tracking to identify possible positive coronavirus cases around users. It detects other devices with the Aarogya Setu app installed and alerts users based on their proximity to those devices. It also captures this information to let authorities know about the movement of suspected cases. The recommendations leverage Bluetooth technology and artificial intelligence algorithms and are based on inputs and best practices suggested by expert medical practitioners and epidemiologists. The government says that the information will be used to reach the user in case medical intervention is needed.

If a contact tests positive for coronavirus, the app calculates the user’s risk of infection based on the recency and proximity of their interaction and recommends suitable action. The app also has a self-assessment test that captures the user’s current vulnerability to Covid-19 infection and provides contextual advice.
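
The exact scoring logic inside Aarogya Setu is not public, but the general idea of weighting contacts by recency and proximity can be illustrated with a minimal sketch. The Python snippet below uses an entirely hypothetical ContactEvent structure, with illustrative thresholds and weights, to show one way such a risk label could be derived from Bluetooth contact events.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List

@dataclass
class ContactEvent:
    """A Bluetooth proximity event with a device later reported positive (hypothetical structure)."""
    timestamp: datetime      # when the contact occurred
    distance_m: float        # estimated distance, e.g. derived from signal strength
    duration_min: float      # how long the devices stayed in range

def contact_risk(events: List[ContactEvent], now: datetime) -> str:
    """Combine recency, proximity, and duration into a coarse risk label."""
    score = 0.0
    for e in events:
        days_ago = (now - e.timestamp).days
        if days_ago > 14:                               # outside a typical 14-day window
            continue
        recency = 1.0 - days_ago / 14.0                 # newer contacts weigh more
        proximity = max(0.0, 1.0 - e.distance_m / 6.0)  # closer contacts weigh more
        duration = min(e.duration_min / 30.0, 1.0)      # cap the duration weight at 30 minutes
        score += recency * proximity * duration
    if score >= 1.0:
        return "high"
    if score >= 0.3:
        return "moderate"
    return "low"

# Example: one close, recent, 20-minute contact
now = datetime(2020, 7, 5)
events = [ContactEvent(now - timedelta(days=2), distance_m=1.5, duration_min=20)]
print(contact_risk(events, now))   # -> "moderate"
```

A production app would, of course, calibrate the weights and cut-offs with epidemiologists rather than use the illustrative constants above.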

 

Also, users can upload their Google Timeline history to a website, where an analytics algorithm would generate a list of locations where they may have been exposed to a Covid-19 case in the recent past. The information would then be displayed on a color-coded map, with each visited location ranked as either high, moderate, or low risk of exposure. This feature would allow users to determine whether they were exposed to an individual who tested positive.
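
A minimal sketch of how such a timeline cross-match could work is shown below. The exposure_report function, the 14-day window, and the visit data structures are hypothetical illustrations, not the actual algorithm behind any specific website.

```python
from datetime import datetime, timedelta

# Each visit: (place_id, arrival, departure). In practice these would come from a parsed
# Google Timeline export and a health authority's case-location feed (both hypothetical here).
def exposure_report(user_visits, case_visits, now, window_days=14):
    """Rank places where the user's stay overlapped a known case's stay in the recent past."""
    cutoff = now - timedelta(days=window_days)
    overlaps = {}
    for place, u_start, u_end in user_visits:
        for c_place, c_start, c_end in case_visits:
            if place != c_place or c_end < cutoff:
                continue
            # two time intervals overlap if each starts before the other ends
            if u_start <= c_end and c_start <= u_end:
                overlaps[place] = overlaps.get(place, 0) + 1
    def label(n):
        return "high" if n >= 3 else "moderate" if n == 2 else "low"
    return {place: label(n) for place, n in overlaps.items()}

now = datetime(2020, 7, 5)
user = [("grocery_store_42", datetime(2020, 7, 1, 10), datetime(2020, 7, 1, 11))]
cases = [("grocery_store_42", datetime(2020, 7, 1, 10, 30), datetime(2020, 7, 1, 12))]
print(exposure_report(user, cases, now))   # -> {'grocery_store_42': 'low'}
```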

 

Data analytics can bring insights via interactive visualizations and data dashboards

Figure 1: Bing.com/Covid tracker

From simple trend analysis models that allow users to compare infections and fatalities across the world (Fig. 1) to more complex mathematical modeling of epidemics such as the SEIR model, epidemic tracking algorithms and practitioners have come up with genuinely interesting methodologies to track and manage contact data.
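
As a point of reference, the SEIR model mentioned above splits a population into Susceptible, Exposed, Infectious, and Recovered compartments and evolves them with a small set of differential equations. The sketch below integrates the standard textbook formulation with SciPy; the parameter values are illustrative placeholders, not estimates for Covid-19.

```python
import numpy as np
from scipy.integrate import odeint

def seir(y, t, beta, sigma, gamma, n):
    """Standard SEIR compartmental model: Susceptible, Exposed, Infectious, Recovered."""
    s, e, i, r = y
    ds = -beta * s * i / n
    de = beta * s * i / n - sigma * e
    di = sigma * e - gamma * i
    dr = gamma * i
    return ds, de, di, dr

n = 1_000_000                     # population size (illustrative)
y0 = (n - 10, 0, 10, 0)           # start with 10 infectious individuals
t = np.linspace(0, 180, 181)      # simulate 180 days
# beta: transmission rate, sigma: 1 / incubation period, gamma: 1 / infectious period (placeholders)
solution = odeint(seir, y0, t, args=(0.5, 1 / 5.2, 1 / 7, n))
print("Peak infectious count:", int(solution[:, 2].max()))
```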

The presentation of trends and real-time events on data dashboards will improve analysis and enable fast action. Geospatial data display will help authorities gain location-based insights and analyze the factors leading to disease spread.

BI and analytics tools can offer different ways to visualize virus spread and individual locations. One such visualization could provide a colored map of all locations visited during the last 48 hours by individuals known to have Covid-19. The visualization would use a function that combines the number of Covid-19 cases that visited a location with how recently they did so, before color-coding it.
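
A rough sketch of such a color-coding function, rendered on a map with the open-source folium library, might look like the following. The location data, score thresholds, and decay constant are hypothetical choices made only for illustration.

```python
import math
import folium  # pip install folium

def location_color(case_visits, hours_since_last_visit):
    """Combine how many known cases visited a place with how recently, then bucket into a color."""
    recency = math.exp(-hours_since_last_visit / 24.0)   # influence decays over roughly a day
    score = case_visits * recency
    if score >= 3:
        return "red"       # high risk
    if score >= 1:
        return "orange"    # moderate risk
    return "green"         # low risk

# Hypothetical locations visited by confirmed cases in the last 48 hours
locations = [
    {"name": "Pharmacy", "lat": 40.7580, "lon": -73.9855, "case_visits": 5, "hours_ago": 6},
    {"name": "Park entrance", "lat": 40.7678, "lon": -73.9718, "case_visits": 1, "hours_ago": 40},
]

m = folium.Map(location=[40.76, -73.98], zoom_start=13)
for loc in locations:
    folium.CircleMarker(
        location=[loc["lat"], loc["lon"]],
        radius=8,
        color=location_color(loc["case_visits"], loc["hours_ago"]),
        fill=True,
        popup=loc["name"],
    ).add_to(m)
m.save("exposure_map.html")   # open in a browser to explore the color-coded map
```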

 

Also, by using a slider to control the time frame, users can see the number of Covid-19 cases over time, including casualties. This will enable health officials to visualize how the virus has progressed over time, not just in their own countries but also at the regional (county) level.
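
Modern charting libraries make this kind of time slider nearly free. The sketch below uses Plotly Express, whose animation_frame argument adds a scrubber under the map; the case counts in the sample frame are placeholder values, and a real dashboard would load a published time series instead.

```python
import pandas as pd
import plotly.express as px  # pip install plotly

# Placeholder daily case counts; a real dashboard would load a public time-series dataset.
df = pd.DataFrame({
    "date":    ["2020-03-01", "2020-03-01", "2020-04-01", "2020-04-01"],
    "country": ["Spain", "India", "Spain", "India"],
    "cases":   [100, 10, 100_000, 2_000],
})

# animation_frame adds a slider so users can scrub through time
fig = px.choropleth(
    df,
    locations="country",
    locationmode="country names",
    color="cases",
    animation_frame="date",
    color_continuous_scale="Reds",
    title="Confirmed Covid-19 cases over time (illustrative data)",
)
fig.show()
```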

Animated charts like the ones popularized by Professor Hans Rosling of the Gapminder Foundation[ii] allow us to track these measures over time.

Click here to see some innovative Gapminder data visualizations.

 

Figure 2: Multi-Level Sankey Template (Tableau viz author: Ken Flerlage)

Sankey Charts

These are easy-to-read visualizations[iii] for tracking contacts and the spread of infection. Additional data and data filtering (e.g., color-coded contact type, positive/negative test results) will further enhance the utility and readability of virus spread patterns.
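
As a simple illustration of the idea (not a reproduction of the Tableau template in Figure 2), a contact-chain Sankey can be drawn in a few lines with Plotly. The cases, contacts, and interaction counts below are entirely made up.

```python
import plotly.graph_objects as go  # pip install plotly

# Hypothetical contact-tracing data: index cases on the left, traced contacts on the right,
# with link width showing the number of recorded interactions.
labels = ["Case A", "Case B", "Contact 1 (positive)", "Contact 2 (negative)", "Contact 3 (pending)"]
colors = ["red", "red", "red", "green", "gray"]

fig = go.Figure(go.Sankey(
    node=dict(label=labels, color=colors, pad=20, thickness=15),
    link=dict(
        source=[0, 0, 1, 1],        # indices into `labels` for the origin of each link
        target=[2, 3, 3, 4],
        value=[3, 1, 2, 4],         # number of recorded interactions per link
    ),
))
fig.update_layout(title_text="Contact chains by test result (illustrative data)")
fig.show()
```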

Customized solutions and location-specific insights to control the situation

Location analytics in combination with geo-spatial big data analytics can help healthcare officials understand why certain procedures work in one region and fail in another. It can also help government bodies to understand various aspects of the outbreak by monitoring and tracking it in real-time.

For example, an individual could decide to avoid a particular store in their area that has had a large percentage of Covid-19-positive visitors in the recent past, and instead shop somewhere with a lower rate. In New York City, for instance, a person could use this information to determine whether it would currently be safer to go for a walk or jog in Central Park than in a Bronx neighborhood.

Enhanced geo-coded data[iv] allows users to visually engage with the data and make appropriate decisions. (Click on the map to see current data from The New York Times.)

In Spain, Asistencia-Covid19, which launched first as a pilot in the Community of Madrid (and has now been rolled out by other autonomous communities in Spain), features a questionnaire through which users can check whether the symptoms they are experiencing match typical COVID-19 symptoms. Based on this information, the app provides recommendations regarding the need to isolate or contact health services. It also allows users to track how their symptoms evolve.

Optionally, users can also agree to share their device’s location data with the app, “with the purpose of guaranteeing the quality of the data and its epidemiological analysis,” as explained on the app’s website. What the app does require users to do is fill out a form with personal information, including a contact address where health authorities can reach the person if necessary. The app has some similarities with Stop Covid19 Cat, offered in Catalonia. However, the latter requires users to consent to sharing their geolocation data, which allows authorities to gather information on how the pandemic is spreading throughout the region.

In short, the more sensitive the data is from a privacy protection standpoint, the more useful it is from an epidemiological point of view. This implies that citizens may be faced with choosing between anonymity and convenience, a choice we all make daily in any case when using digital services. Still, there are cultural factors that set apart the choices individuals in the East and West might make: faced with this dilemma, in the East the common good tends to prevail over individual rights.

Conclusion

The role of big data and artificial intelligence is very important in trying to limit the spread of the disease. Beyond geo-location and mobility information, the millions of smartphones around the world can provide both a historical collection of data and real-time (or near real-time) insights for analysis and more accurate predictions. Advanced BI algorithms and computational models can be applied to analyze these large gathered datasets in order to answer questions, predict different scenarios under given conditions, and come up with the best recommendations and instructions.

To sum up, potential BI / analytics application scenarios can include, but are not limited to:

  • Providing accurate, detailed, time-based historical location information for infected individuals and their close circles (potential infections) to establish chains of transmission.
  • Rapid assessment of the probability of exposure in a given area (or cluster) by cross-matching smartphone locations of affected and suspected individuals (via GPS or Bluetooth sensors).
  • Adding video / CCTV records to identify people in particular infection hotspots.
  • Data visualization that offers a lay-person view of the pandemic dataset, making decision making much simpler.

[i] “Why outbreaks like coronavirus spread exponentially, and how to ‘flatten the curve’,” Harry Stevens, The Washington Post, March 14, 2020.

[ii] www.gapminder.com

[iii] Viz author: Ken Flerlage, Tableau Public.

[iv] “Coronavirus in the U.S.: Latest Map and Case Count,” The New York Times, updated June 30, 2020.


Meenakshinathan (Nathan) Padmanabhan is a Practice Lead for Business Intelligence at Visvero. He has been supporting several clients across the Financial Services, Travel & Hospitality, Utilities, Retail, and other verticals, helping them build effective strategies for data engagement platforms and dashboard applications. Nathan leads a team of subject matter and technical experts working across technologies from the Microsoft, Qlik, Tableau, Informatica, Hadoop, Azure, and AWS platforms.

Digital Transformation: Conagra Case Study
July 23, 2019

CIO Helps Conagra Turn Food Trends Into Products

A case study of how Conagra’s CIO, Mindy Simon, leveraged an AI platform to help drive growth, including the launch of new products under the company’s existing brand names. The platform sources data from the likes of Facebook, Google, Instagram, and Pinterest, as well as from various market research companies, with frequent updates. Pulling the data is automated, as is prepping it for analytics applications.

The tool lets the business identify “pockets of growth” that might have otherwise gone unnoticed. This is largely because the platform can tie data together in one place. Click on the link below to read about this project.

Conagra AI application (2 pages / 850 KB), CIO Journal

 

Copyright (c) 2019 Dow Jones and Company, Inc. CIO Journal 07/22/2019

Business Intelligence in the age of Analytics and Artificial Intelligence
July 8, 2019

ADVANCE TO INTELLIGENT DECISION-MAKING

Jonathan Bach & Arvind Handuu

LET US START WITH GETTING THE BASICS OUT OF THE WAY

– DATA IS NOT THE NEW OIL; IMPORTANT, YES, BUT BUSINESS IS ABOUT DOING THE BASICS RIGHT

– ALL BUSINESSES ARE NOT INFORMATION BUSINESSES

– ARTIFICIAL INTELLIGENCE IS NOT INTELLIGENCE, JUST A BEST GUESS

– DATA BY ITSELF, DEVOID OF CONTEXT, IS WORTHLESS

These days, one cannot pick up a trade publication without noticing a convergence on the flavor of the day, namely Analytics and its ‘on-steroids’ version, Artificial Intelligence (AI). The collective wisdom points to the efficacy of data utilization to transform the business. Many of our clients have gone through this stage of data envy. Some continue to allocate a disproportionate share of the scarce resources of time and treasure to “monetize their data”.

Like all gold rush stories, this one is fraught with peril. The primary reason appears to be the significance attached to the technology as opposed to the business objectives. We’ve seen this before, with corporations jumping from one tool to another in the hope that the next tool will be the one to save the initiative, the company, or the project. The problem lies in the training, or lack thereof, within organizations to ask the right question. In this, the question becomes the answer.

The goal, it appears, is to replace, or at least augment, an organization’s intuition and experience-based decision making with an element of data centricity, to avoid missteps and to shine a spotlight on certain blind spots. Data centricity assumes a level of data availability and education without which any further progress is impossible. We propose the following stepwise approach to help clients optimize operations and leverage data assets where possible and necessary.

Automate and Optimize – Most businesses have implemented some level of automation, some more than others; that is the starting point. Before beginning any large-scale transformation, challenge what is already implemented, even if it is nothing more than a general ledger at this time. This is a foundational step in transforming a company. The key watchwords in this stage are flexible and adaptable: know that what is built here will need to be changed and rebuilt, so the automation platform should be flexible and adaptable.

Digital from the outset – This is an appropriate analytical step in the organization’s life cycle, as the model should be to think innovatively about key aspects of the business. The approach here is to develop transformative competencies that have the potential to be disruptive (think Uber versus taxis, or Airbnb versus hotels), as well as operational transparency. In this stage the company starts to build data competencies.

Value creation – With the core of data collection in place, the organization may launch the build stage of the analytics framework. The organization decides to make key data available to executives for informing tactical decisions and strategic choices. The key success factors in this process are:

Analytics Evangelist – An organization needs to allocate a role for developing data insights and educating line managers on the possibilities of data analysis.

Speed to deliver – Time kills energy; this is as true in companies as it is in physics. The organization should have the necessary strategy in place to make analytics results available to managers within a few days of demand.

Keep it simple – An organization must simplify the learning and deployment of analytics by reducing rework. It is often helpful to choose technologies that offer a consolidated BI and analytics approach.

Adopt and adapt – Reduce human variability. The weakness of data-driven decision-making often lies in ignoring the human factors in adoption:

Education – Implement programs to drive overall data competency across the organization

Build momentum through ease of access – make it easier for executives to get analytical answers

Implement and measure the use of analytical elements in corporate decision-making

Reward the utilization of analytical tools

New applications – Encourage the use of analysis by promoting new applications and new analyses. Share and promote.

Critique – Set periodic review points to analyze decisions made and supported by hard, measured data as well as decisions made without adequate data. Compare outcomes.

Educate and Evolve – Business Intelligence and Analytics offers a high ROI and can be implemented very cost-effectively. The organization needs to ensure a steady process of training users on the new data and analyses being made available, and on effective strategies for utilizing them.

In the end, data can be a very effective tool for the decision-makers in an organization; its power becomes available as more investments are made in developing the human capital to ask the right questions of the business. Features in modern BI and analytical tools can help users overcome delays and difficulties by automating aspects of data exploration and analytics development, and by delivering answers and recommendations to users in the context in which they need them.


Jonathan Bach is a Client Solutions Partner at Visvero, Inc. He works closely with various F2000 clients in optimizing the use of professional services for “ADVANCE”-ing along the BI/Analytics path. Jon is based out of Visvero, Pittsburgh.

 

Arvind Handuu is a Practice Manager for Business Intelligence & Analytics at Visvero. Arvind is an analytics value purist. He believes that a BI & Analytics platform should be a self-contained and sovereign solution. “Its value drops to zero the instant you are using a different data source to inform your decision.” Arvind is based out of Visvero, Pittsburgh.

Data Virtualization
April 8, 2019

DATA VIRTUALIZATION – WHAT’S REAL?

Meenakshinathan Padmanabhan & Arvind Handuu

Unless you’ve been living under a rock for the past few years, or have chosen not to look at a printed word, you’d have been told, often enough that you are physically fatigued, that data is what you produce and that data is the new oil: the real cure for all that ails us; how, once we organize it and give it a more well-rounded experience and learning, the world would be a better place. My cat is now afraid of data; it’s data’s world, not the cat’s, and all of us, the cat included, just rent it.

To be fair though, the impact of data – its availability, regeneration, volume, age, lineage, utility, diversity, value, and reach – though at times overstated, is significant enough that organizations are well served by continually scanning the environment for opportunities that make value creation possible.

In a recent Forbes survey (2018), Data Virtualization ranked among the top three highest-growth areas for 2018/2019.

This post is an attempt to simplify the conceptual presentation of Data Virtualization. So let’s take it from the top.


So, what is Data Virtualization? What is a good use case for it? And what are the corner cases where this approach fails?

Data management, especially in applications that require post-fact data creation and analysis of institutional data assets, is an ever-moving target. It appears that just as a seemingly effective governance model is implemented and the initial set of questions is answered, new questions arise, often requiring new information and data sources to be incorporated into the answer base. This requires reworking the data warehouse, introducing the new data source, and applying the same rigor to ensure data hygiene. And so an expensive build, add, analyze cycle repeats. Data Virtualization offers a near-term reprieve from this cycle by making it easy to introduce new data sources rather quickly. This approach has the potential to be THE solution in less complex data environments, and at least a resilient intermediate solution in applications with higher data complexity.


Data virtualization is the process of offering data consumers a data
access interface that hides the technical aspects of stored data, such
as location, storage structure, API, access language, and storage
technology.

Data virtualization creates integrated views of data drawn from disparate
sources, locations, and formats, without replicating the data, and delivers
these views, in real time, to multiple applications and users. Data
virtualization is any approach to data management that allows an
application to retrieve and manipulate data without requiring technical
details about the data, such as how it is formatted at source, or where it
is physically located, and can provide a single customer view (or single
view of any other entity) of the overall data. Data virtualization can draw
from a wide variety of structured, semi-structured, and unstructured
sources, and can deliver to a wide variety of consumers. Because no
replication is involved, the data virtualization layer contains no source
data; it contains only the metadata required to access each of the
applicable sources, as well as any global instructions that the
organization may want to implement, such as security or governance
controls. This concept and software is a subset of data integration and is
commonly used within business intelligence, service-oriented architecture
data services, cloud computing, enterprise search, and master data
management.
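
To make the idea concrete, the sketch below simulates the pattern in plain Python: a “virtual view” object holds only references to two hypothetical source systems (a SQL database and an in-memory orders feed) and assembles a single customer view on demand, without copying either source into a warehouse. This is a conceptual illustration under stated assumptions, not any vendor’s API.

```python
import sqlite3
import pandas as pd

# Two hypothetical "source systems": a CRM database and an orders feed.
crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
crm.execute("INSERT INTO customers VALUES (1, 'Acme Corp'), (2, 'Globex')")

orders = pd.DataFrame({"customer_id": [1, 1, 2], "amount": [120.0, 80.0, 45.0]})

class VirtualCustomerView:
    """Holds only connection metadata; rows are fetched from the sources at query time."""
    def __init__(self, crm_conn, orders_frame):
        self.crm_conn = crm_conn
        self.orders_frame = orders_frame

    def single_customer_view(self) -> pd.DataFrame:
        customers = pd.read_sql("SELECT id, name FROM customers", self.crm_conn)
        totals = self.orders_frame.groupby("customer_id", as_index=False)["amount"].sum()
        # Join on demand instead of replicating both sources into a warehouse
        return customers.merge(totals, left_on="id", right_on="customer_id", how="left")

view = VirtualCustomerView(crm, orders)
print(view.single_customer_view())
```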

The concept was initially incorporated in various business intelligence tools such as Qlik, Spotfire, and Tableau, to name a few. The obvious limitation was the close coupling between the virtual data store and the choice of analytical (at the time, mainly data visualization) tools. That meant the limitations of the analytical tools defined the extent to which data could be utilized. The graphic below represents the data virtualization approach of one of the leading solution vendors in this technology, Denodo.

Image Courtesy: Denodo

Our teams have taken the position that, for very small database volumes and relatively clean data sources, data virtualization is an effective solution that allows a federated data structure and quick analytics delivery. However, as data complexity increases, organizations will need more disciplined data governance practices implemented in a data-warehouse-led analytics platform. In such cases, a virtualized database solution can still be used as a rapid proof-of-concept solution to test various source systems.

We find data virtualization highly effective in the following use cases:

‣ Generally structured data sources with easy-to-define relationships. Referring to the promise stated earlier in this article, Data Virtualization really does deliver on the data integration front. Whether one needs data from a mobile application or from hundreds of domains and other web technologies, Data Virtualization consolidates all of that into a single solution.

‣ Data virtualization supports the integration of structured and
semi-structured data, and is seamlessly supported by the likes of Hadoop
and MapReduce.

‣ Rapid analytics delivery or short-term proof-of-concept solutions. Unlike some massive data management solutions, Data Virtualization can be implemented at an unnervingly rapid rate. It can be implemented on top of existing infrastructure in a matter of weeks or months, and some Data Virtualization adopters have reported an ROI turnaround of less than six months.

‣ Direct exposure to the source applications: a key reason for data virtualization is the ability to incorporate operational data in real time.

While the above might appear compelling, data virtualization falls short in
the following key application areas:

‣ Historical and lineage tracking applications, e.g., Slowly Changing Dimension Type I / Type II problem areas. Organizations need data warehouses when there is a need to analyze data that is days, weeks, or even months old; data warehouses are the better option in this case.

‣ Data Virtualization can impose a great deal of stress on an organization’s operations, often requiring significant overhead. Changes need to be integrated and distributed to every user and application within the entire infrastructure, which can be a huge financial and logistical strain on the environment.

‣ Overall effectiveness: data virtualization solutions can be deceptively difficult, and their effectiveness in managing real-time data delivery can be a little underwhelming. The expectation gap usually occurs when an organization assumes that, because it is using a powerful Data Virtualization solution, it no longer has to manage its own data.

In the data management space there are very few, if any, magic bullets. Data Virtualization is an effective Swiss Army knife in a data architect’s or solution strategist’s toolkit. While data virtualization is far from perfect today, the overall market is evolving at a rapid rate to provide access to real-time, easily managed data. And although it should not be the sole mode of capturing, interpreting, and managing BI data, the virtualized data warehouse is an effective strategy for creating business value and introducing additional data sources into the analytics framework.



Meenakshinathan (Nathan) Padmanabhan is a Sr. Data Solutions Architect at Visvero, Inc. He has been supporting various F2000 clients in deploying effective data management, Business Intelligence, and Analytics solutions for over 20 years. Nathan is based out of Visvero, Pittsburgh.

 


Arvind Handuu is a Practice Manager for Business Intelligence &
Analytics at Visvero. Arvind is an analytics value purist. He
believes that a BI & Analytics platform should be a
self-contained and sovereign solution. “Its value drops to zero the
instant you are using a different data source to inform your
decision.” Arvind is based out of Visvero, Pittsburgh.
