A tool for anyone who needs to import and work with data
data inventory
Track your data sources
It's amazing how many different data sets come our way. Tracking
where they are, what they cover, and whether they include the latest
updates can be a time-consuming process.
Take inventory of data anywhere: your drive, shared drives,
corporate data centers, and the cloud. A permanent inventory of
data sources is built into the process.
The source data is never changed, ensuring a complete record of
your process from start to finish.
align
Build domain-specific data schemas faster than ever before
Who knew that naming fields could be so time-consuming and prone
to error? The process reduces the complexity of this tedious task
through a combination of approaches.
The process leverages the inherent qualities of the data, and the
information it contains, to reduce the burden of tracking each
field in isolation.
- A single-purpose user interface provides a layered view that highlights where and when additional attention is required
- Instant, on-demand context and field-specific summary views
- Relate fields from different sources in less than a minute.
The process interprets and validates how all of the data fits together, instantly, and provides ongoing, real-time feedback and next steps.
- Identifies gaps or inconsistencies
- Provides detailed instructions so that issues can often be fixed on the spot
- Offers automated solutions to make changes across data sources
The result is a fully automated and reproducible process, and the end result is palpable: confidence and consistent treatment of related fields.
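For readers who prefer to see the idea in code, here is a minimal sketch of field alignment in Python (the sources, column names, and mapping are hypothetical, not the application's actual schema):

```python
import pandas as pd

# Two hypothetical sources that describe the same thing with different field names.
crm = pd.DataFrame({"cust_id": [1, 2], "rev_usd": [120.0, 80.0]})
billing = pd.DataFrame({"customer": [1, 3], "revenue": [115.0, 40.0]})

# One shared schema: map each source's local names onto common field names.
schema_map = {
    "crm": {"cust_id": "customer_id", "rev_usd": "revenue_usd"},
    "billing": {"customer": "customer_id", "revenue": "revenue_usd"},
}
crm_aligned = crm.rename(columns=schema_map["crm"])
billing_aligned = billing.rename(columns=schema_map["billing"])

# A simple consistency check surfaces gaps before the sources are combined.
gaps = set(crm_aligned.columns) ^ set(billing_aligned.columns)
assert not gaps, f"schema gap: {gaps}"
```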
stacking
Combine multiple sources of truth quickly - confidently
No more need to generate and manage scripts that describe how to combine the data sets. Easily specify and change how data is combined.
- Changes are made with a view of the complete data set
- Dedicated processes for validating and propagating user-input throughout the workflow
- The workflow anticipates, and is built to facilitate, a consistent and informative naming convention
The workflow for accomplishing the task is optimized in several ways.
- Common-sense, context-sensitive default strategy
- Visual review and adjustment baked into the process
- Easy drag-and-drop adjustments as needed
- Iterate and adjust the documented approach anytime
- Downstream capacity to assert and confirm your intent
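As a rough illustration (hypothetical data, not the platform's own engine), stacking aligned sources can be as simple as concatenating them while keeping a provenance label, so the combination rule lives in one declared place rather than scattered across scripts:

```python
import pandas as pd

# Two hypothetical sources already aligned to a shared schema.
crm = pd.DataFrame({"customer_id": [1, 2], "revenue_usd": [120.0, 80.0]})
billing = pd.DataFrame({"customer_id": [1, 3], "revenue_usd": [115.0, 40.0]})

# Stack them, keeping a provenance column so every record stays traceable.
combined = pd.concat(
    [df.assign(source=name) for name, df in {"crm": crm, "billing": billing}.items()],
    ignore_index=True,
)

# Changing how overlaps are resolved is a one-line edit, not a new script.
combined = combined.sort_values("source").drop_duplicates("customer_id", keep="first")
```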
create new
Generate new fields leveraging best-in-class methodologies
Never again rely on stale computations that introduce
inconsistencies.
Pull from a range of well-known computations (e.g., market share,
customer decile)... as well as less-well-known computations that,
until now, were too complex to be practical.
Create your own integrated computations for you and your
organization to leverage, anytime.
Avoid introducing differences that are external to the subject
universe. Specify the subject universe once; all of the derived
fields will be computed consistently against that single universe.
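A minimal sketch of the idea, using a hypothetical revenue field: because every derived field is computed against the same subject universe, denominators and decile cut-points stay consistent across computations.

```python
import pandas as pd

# One fixed subject universe (hypothetical revenue figures).
universe = pd.DataFrame({
    "customer_id": range(1, 11),
    "revenue_usd": [5, 12, 7, 30, 2, 18, 9, 25, 14, 3],
})

# Both derived fields use the same universe, so they stay consistent.
universe["market_share"] = universe["revenue_usd"] / universe["revenue_usd"].sum()
universe["revenue_decile"] = pd.qcut(universe["revenue_usd"], 10, labels=False) + 1
```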
Create a time-series, or perform any number of data
transformations with one end in mind: a data set structured for
statistical analysis.
- a single record for each subject
- derived fields computed using a single subject universe
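In pandas terms, the target shape is something like the following sketch (hypothetical columns): a long activity table pivoted so that each subject becomes exactly one row, ready for statistical tooling.

```python
import pandas as pd

# Hypothetical long-format activity: several rows per subject.
long = pd.DataFrame({
    "customer_id": [1, 1, 2, 2],
    "month": ["2024-01", "2024-02", "2024-01", "2024-02"],
    "revenue_usd": [10.0, 12.0, 7.0, 9.0],
})

# One record per subject: pivot the time dimension into columns.
wide = long.pivot(index="customer_id", columns="month", values="revenue_usd")
wide.columns = [f"revenue_{m}" for m in wide.columns]
wide = wide.reset_index()   # -> customer_id, revenue_2024-01, revenue_2024-02
```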
compare, compare, compare
Make fair and accurate comparisons - reproducibly
It's one thing to compare two groups; it's another thing entirely to compare them for purposes of statistical inference. Build test and control groups leveraging the industry gold standard.
- Compute a propensity score to create a fair and tight comparison between groups (see the sketch after this list)
- Specify "micro-pools" of control subjects to help track the impact ("lift") from an ongoing campaign
- Expand the control-pool by dynamically re-using controls at different relative points in time
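The sketch below shows the propensity-score idea in its simplest form, using synthetic data and scikit-learn's logistic regression; it illustrates the method, not the platform's implementation.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Synthetic subjects: 1 = test (received the campaign), 0 = candidate control.
df = pd.DataFrame({
    "treated":       [1, 1, 1, 0, 0, 0, 0, 0],
    "pre_revenue":   [10, 14, 9, 11, 3, 15, 8, 4],
    "tenure_months": [24, 30, 18, 26, 6, 31, 20, 8],
})

# Propensity score: probability of being in the test group, given pre-period covariates.
features = ["pre_revenue", "tenure_months"]
model = LogisticRegression().fit(df[features], df["treated"])
df["propensity"] = model.predict_proba(df[features])[:, 1]

# Greedy nearest-neighbour matching: each test subject gets the closest control.
controls = df[df["treated"] == 0]
matches = {
    row.Index: (controls["propensity"] - row.propensity).abs().idxmin()
    for row in df[df["treated"] == 1].itertuples()
}
```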
Visualizing the data is core to ensuring a valid comparison. The task can become nearly impossible, however, when the main effect occurs at different points in time for different subjects. No more. The platform supports both absolute and relative time; either way, the derived fields for the test and control groups will be correct and always computed from the latest iteration.
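A small sketch of the relative-time idea (hypothetical intervention dates): re-indexing each subject's activity to its own start date puts effects that begin on different calendar dates onto a common axis.

```python
import pandas as pd

# Hypothetical activity and per-subject intervention dates.
events = pd.DataFrame({
    "customer_id": [1, 1, 2, 2],
    "date": pd.to_datetime(["2024-01-05", "2024-02-05", "2024-03-10", "2024-04-10"]),
    "revenue_usd": [10.0, 14.0, 7.0, 11.0],
})
start = pd.Series(
    pd.to_datetime(["2024-01-05", "2024-03-10"]), index=[1, 2], name="start"
)

# Relative time: days since each subject's own intervention date.
events = events.join(start, on="customer_id")
events["days_since_start"] = (events["date"] - events["start"]).dt.days
```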
always current
Built for continuous data updates
When you get updated data, no problem. The fully automated process makes it easy to integrate and iterate with confidence. Integrate a new dataset that might explain some of the variance (i.e., something that reduces the noise so you can see more of the signal).
- The system will instantly identify what interventions are required
- Recompute the size and shape of the subject universe
- Recompute the derived fields
- Rebuild the test and control groups
- Run and present the statistical findings
infinite possibilities
Unprecedented opportunity for collaboration
The application is built for collaboration, making it easy to share and iterate on both the design and the details of the data.
- Secure, client-server technology
- Capacity to share the design within your team
- Single sign-on for wherever the data lives: Microsoft and Google
Review and share your approach with a colleague. The configuration
is ready and available anytime.
Leverage the same methodology for the next study.
Control what is shared, and when. Data access and
licensing can all be managed from this single point of control.
In the event there is even more you want to accomplish, or you need
to feed another data pipeline, we have you covered. Once you have
structured your data, it is stored in formats that can be used
anywhere.
- CSV
- Excel
- pandas DataFrame
- Polars DataFrame
- SQLite
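For example, a structured result can be handed off roughly like this (file and table names are placeholders):

```python
import sqlite3
import pandas as pd

# A hypothetical analysis-ready table.
data = pd.DataFrame({"customer_id": [1, 2], "revenue_usd": [120.0, 80.0]})

data.to_csv("analysis_ready.csv", index=False)           # CSV
data.to_excel("analysis_ready.xlsx", index=False)        # Excel (needs openpyxl)
with sqlite3.connect("analysis_ready.db") as conn:       # SQLite
    data.to_sql("analysis_ready", conn, if_exists="replace", index=False)

# Or keep it in memory as a pandas DataFrame, or convert to Polars:
# import polars as pl; pl_frame = pl.from_pandas(data)
```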
Get a headstart on your analysis.
Try it now for free.
- Get going with your analysis
- Create the perfect data set in minutes
- Enjoy early-access with free, live 1:1 support