Commit 8eb5171

doc: website and version
- update all relevant website content for curve calibrator and robynpy
- bump up version
- adapt curve_type to saturation_reach_hill to account for future options
- update maintainers
1 parent bcd0a13 commit 8eb5171

7 files changed: +101 -14 lines changed

R/DESCRIPTION

Lines changed: 5 additions & 5 deletions
@@ -1,14 +1,14 @@
 Package: Robyn
 Type: Package
 Title: Semi-Automated Marketing Mix Modeling (MMM) from Meta Marketing Science
-Version: 3.11.1.9004
+Version: 3.12.0.9000
 Authors@R: c(
-person("Gufeng", "Zhou", , "[email protected]", c("aut")),
-person("Bernardo", "Lares", , "[email protected]", c("cre","aut")),
-person("Leonel", "Sentana", , "[email protected]", c("aut")),
+person("Gufeng", "Zhou", , "[email protected]", c("cre", "aut")),
 person("Igor", "Skokan", , "[email protected]", c("aut")),
+person("Bernardo", "Lares", , "[email protected]", c("aut")),
+person("Leonel", "Sentana", , "[email protected]", c("aut")),
 person("Meta Platforms, Inc.", role = c("cph", "fnd")))
-Maintainer: Bernardo Lares <laresbernardo@gmail.com>
+Maintainer: Gufeng Zhou <gufeng@meta.com>
 Description: Semi-Automated Marketing Mix Modeling (MMM) aiming to reduce human bias by means of ridge regression and evolutionary algorithms, enables actionable decision making providing a budget allocation and diminishing returns curves and allows ground-truth calibration to account for causation.
 Depends:
 R (>= 4.0.0)

R/R/calibration.R

Lines changed: 3 additions & 3 deletions
@@ -13,7 +13,7 @@
 #' @inheritParams robyn_run
 #' @param df_curve data.frame. Requires two columns named spend and response.
 #' Recommended sources of truth are Halo R&F or Meta conversion lift.
-#' @param curve_type Character. Currently only allows "saturation_reach"
+#' @param curve_type Character. Currently only allows "saturation_reach_hill"
 #' and only supports Hill function.
 #' @param force_shape Character. Allows c("c", "s") with default NULL that's no
 #' shape forcing. It's recommended for offline media to have "c" shape, while
@@ -45,7 +45,7 @@
 #' # Using reach saturation from Halo as proxy
 #' curve_out <- robyn_calibrate(
 #' df_curve = df_curve_reach_freq,
-#' curve_type = "saturation_reach"
+#' curve_type = "saturation_reach_hill"
 #' )
 #' # For the simulated reach and frequency dataset, it's recommended to use
 #' # "reach 1+" for gamma lower bound and "reach 10+" for gamma upper bound
@@ -77,7 +77,7 @@ robyn_calibrate <- function(
 # hp_bounds format
 # hp_interval
 
-if (curve_type == "saturation_reach") {
+if (curve_type == "saturation_reach_hill") {
 curve_collect <- list()
 for (i in unique(df_curve$freq_bucket)) {
 message(">>> Fitting ", i)

R/man/robyn_calibrate.Rd

Lines changed: 2 additions & 2 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

website/docs/features.mdx

Lines changed: 50 additions & 1 deletion
@@ -333,7 +333,7 @@ As depicted in plot 4 in [session model onepager](#model-onepager) below, the k-
 
 ---
 
-## Calibration with causal experiments
+## Calibration of average effect size with causal experiments
 
 Randomised controlled trial (RCT) is an established academic gold standard to infer causality in science. By applying results from RCT in ads measurement that are considered ground truth, you can introduce causality into your marketing mix models. Robyn implements the MMM calibration as an objective function in the multi-objective optimization by parameterizing the difference between causal results and predicted media contribution.
 
@@ -363,6 +363,55 @@ There're two major types of experiements in ads measurement, as pointed out by t
 
 Robyn accepts a dataframe as calibration input in the `robyn_inputs()` function. The function usage can be found in the [demo](https://github.com/facebookexperimental/Robyn/blob/main/demo/demo.R#L262).
 
+---
+
+## Holistic calibration
+
+### Rethinking calibration
+
+The triangulation of MMM, experiments and attribution is the centerpiece of [modern measurement](https://www.facebook.com/business/news/advanced-measurement-strategy). While there's no universally accepted definition of calibration, it often refers to adjusting the estimated impact of media between different measurement solutions. In MMM, calibration usually refers to adjusting the average effect size of a certain channel using causal experiments, as explained in detail above. However, the average effect size, or the beta coefficient in a regression model, is not the only estimate in an MMM system. The bare minimum set of estimates in MMM includes the **average effect size, adstock and saturation**. Just like the effect size, adstock and saturation are uncertain parametric quantities that can and should be calibrated against ground truth whenever possible. We believe that **holistic calibration** is the next step towards triangulation and an integrated marketing measurement system.
+
+<img alt="Reach & frequency curve calibration" src={useBaseUrl('img/curve_calibrator.png')} />
+
+### The curve calibrator (beta)
+
+Robyn is releasing a new feature, **"the curve calibrator"** `robyn_calibrate()`, as a step towards holistic calibration. The first use case is to calibrate the response saturation curve using cumulative reach and frequency data as input. This type of data is usually available as siloed media reports for most offline and online channels. The latest option for reach and frequency data is **[Project Halo](https://wfanet.org/leadership/cross-media-measurement)**, an industry-wide collaboration to improve cross-channel reach deduplication. The graphic above is derived and simulated from a real Halo dataset with cumulative spend and cumulative reach by frequency bucket. For example, "reach 3+" means reaching 3 impressions on average per person. There's certainly a gap between the saturation of reach and that of business outcomes (purchases, sales etc.). However, they're interconnected along the same conversion funnel (upper funnel -> lower funnel), and reach & frequency saturation curves are often more readily available. Therefore, we're exploring the potential of using reach & frequency to guide response saturation estimation.
+
+According to [a recent paper](https://arxiv.org/abs/2408.07678) from the Wharton School and the London Business School by Dew, Padilla and Shchetkina, a common MMM cannot reliably identify saturation parameters. To quote: _"as practitioners attempt to capture increasingly complex effects in MMMs, like nonlinearities and dynamics, our results suggest caution is warranted: the simple data used for building such models often cannot uniquely identify such complexity."_ In other words, saturation should be calibrated by ground truth whenever possible.
+
+**Our approach for saturation calibration by reach & frequency**: Assume an extreme situation where every user sees the first impression and purchases immediately. In such a case, the response saturation curve equals the reach 1+ saturation curve. The [Hill function](https://facebookexperimental.github.io/Robyn/docs/features#saturation) is a popular choice for saturation transformation and is implemented in Robyn, where gamma controls the inflexion point. A lower gamma means earlier and faster saturation at a lower spend level. We believe that the cumulative reach 1+ curve represents the earliest inflexion, so it serves as a reasonable lower bound for gamma for a selected channel. As frequency increases, the inflexion point is delayed and approaches the hidden true response curve. In the dummy dataset, we've simulated reach 10+ to represent the upper bound for gamma. The "best converting frequency" varies strongly across verticals. We believe that reach & frequency data brings us one step closer to identifying the true saturation relationship. Use domain expertise to further narrow down or widen the bounds. For alpha, we recommend keeping the value flexible, as in the default.
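
To make the role of gamma concrete, here is a minimal Python sketch of a two-parameter Hill transformation. It assumes that the inflexion point is obtained by scaling gamma linearly between the minimum and maximum spend; that scaling and the function name `hill_saturation` are assumptions for illustration, not a copy of Robyn's implementation.

```python
def hill_saturation(spend, alpha, gamma):
    """Two-parameter Hill curve returning saturation levels between 0 and 1.

    Assumption: gamma in (0, 1] places the inflexion point between the
    minimum and maximum observed spend.
    """
    lo, hi = min(spend), max(spend)
    inflexion = lo * (1 - gamma) + hi * gamma
    return [x**alpha / (x**alpha + inflexion**alpha) for x in spend]


spend = [0, 25, 50, 75, 100]
early = hill_saturation(spend, alpha=2.0, gamma=0.3)  # inflexion near spend 30
late = hill_saturation(spend, alpha=2.0, gamma=0.8)   # inflexion near spend 80

# At every positive spend level, the lower gamma is closer to saturation.
for x, early_val, late_val in zip(spend, early, late):
    print(f"spend={x:>3}  gamma=0.3 -> {early_val:.2f}  gamma=0.8 -> {late_val:.2f}")
```

With the same alpha, the curve with the lower gamma reaches half of its maximum at a lower spend, which is why a reach 1+ fit is a natural candidate for the lower bound.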
+
+The graphic below is an exemplary visualisation of a curve fitting process, where Nevergrad is used to estimate alpha, gamma as well as beta. Note that the distributions of alpha and gamma are often multimodal and non-normal, because they reflect the hyperparameter optimization path of Nevergrad rather than their underlying distribution.
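
The fitting idea can be sketched in a few lines of Python. This is not the optimizer Robyn uses: Nevergrad is swapped for a plain seeded random search so that the example needs only the standard library. The search ranges, the gamma-to-inflexion scaling and the synthetic data are all assumptions made for illustration.

```python
import random


def hill(x, alpha, inflexion):
    """Hill curve value for a single spend level."""
    return x**alpha / (x**alpha + inflexion**alpha)


def fit_hill(spend, response, n_trials=5000, seed=0):
    """Random search for (alpha, gamma, beta) minimising squared error."""
    rng = random.Random(seed)
    lo, hi = min(spend), max(spend)
    best_loss, best_params = float("inf"), None
    for _ in range(n_trials):
        alpha = rng.uniform(0.5, 3.0)   # illustrative search range
        gamma = rng.uniform(0.05, 1.0)  # illustrative search range
        inflexion = lo * (1 - gamma) + hi * gamma
        shape = [hill(x, alpha, inflexion) for x in spend]
        # For fixed alpha and gamma, the least-squares beta has a closed form.
        beta = sum(s * y for s, y in zip(shape, response)) / sum(s * s for s in shape)
        loss = sum((beta * s - y) ** 2 for s, y in zip(shape, response))
        if loss < best_loss:
            best_loss = loss
            best_params = {"alpha": alpha, "gamma": gamma, "beta": beta}
    return best_params, best_loss


# Synthetic cumulative spend and reach generated from known parameters.
spend = [10 * i for i in range(1, 11)]
true_inflexion = 10 * (1 - 0.4) + 100 * 0.4
reach = [1000 * hill(x, 1.5, true_inflexion) for x in spend]

params, loss = fit_hill(spend, reach)
rmse = (loss / len(spend)) ** 0.5
print(params, f"RMSE={rmse:.2f}")
```

Because different combinations of alpha, gamma and beta can produce nearly the same curve, the recovered parameters may differ from the generating ones even when the fit is close. The samples visited by any search procedure trace its path rather than a posterior distribution.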
+
+<img alt="Reach & frequency curve fitting" src={useBaseUrl('img/curve_calibrator_onepager.png')} />
+
+To try out `robyn_calibrate()`, please see [this tutorial](https://github.com/facebookexperimental/Robyn/blob/main/demo/demo.R#L200) in the demo.
+
+```
+library(Robyn)
+data("df_curve_reach_freq")
+
+# Using reach saturation as proxy
+curve_out <- robyn_calibrate(
+  df_curve = df_curve_reach_freq,
+  curve_type = "saturation_reach_hill"
+)
+# For the simulated reach and frequency dataset, it's recommended to use
+# "reach 1+" for gamma lower bound and "reach 10+" for gamma upper bound
+facebook_I_gammas <- c(
+  curve_out[["curve_collect"]][["reach 1+"]][["hill"]][["gamma_best"]],
+  curve_out[["curve_collect"]][["reach 10+"]][["hill"]][["gamma_best"]])
+print(facebook_I_gammas)
+
+```
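
For readers applying the same idea outside R, the step from two fitted gammas to a pair of bounds can be sketched in Python. The helper `gamma_bounds` and the numeric values are hypothetical, and the clipping assumes gamma is defined on [0, 1]. The `widen` argument stands in for the domain-expertise adjustment described above.

```python
def gamma_bounds(gamma_low_freq, gamma_high_freq, widen=0.0):
    """Return (lower, upper) gamma bounds, optionally widened on both sides.

    Hypothetical helper: assumes gamma is defined on [0, 1].
    """
    lower, upper = sorted((gamma_low_freq, gamma_high_freq))
    lower = max(0.0, lower - widen)
    upper = min(1.0, upper + widen)
    return lower, upper


# Made-up gamma estimates standing in for the "reach 1+" and "reach 10+" fits.
bounds = gamma_bounds(0.42, 0.71, widen=0.05)
print(bounds)
```

Sorting the inputs keeps the bounds valid even if the higher-frequency fit happens to return the smaller gamma.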
+
+### Customizable for 3rd-party MMM
+While the curve calibrator is released within the Robyn package, it can be used standalone without having built a model in Robyn. The current beta version pilots the two-parameter Hill function for saturation. Any MMM solution, not only Robyn, that employs the two-parameter Hill function can be calibrated by the curve calibrator.
+
+In the future, we're planning to partner with our community, advertisers, agencies and measurement vendors to further explore this area and also to expand the curve calibrator to cover other popular nonlinear functions for saturation (e.g. exponential, arctan or power function) as well as adstock (geometric or Weibull function).
+
+
+
 ---
 ## Model onepager
 
website/docs/robyn-api.mdx

Lines changed: 41 additions & 3 deletions
@@ -1,15 +1,53 @@
 ---
 id: robyn-api
-title: Robyn API for Python
+title: Robyn Python (Beta)
 ---
 
 import useBaseUrl from '@docusaurus/useBaseUrl';
 
-Enabling Robyn for Python has been a long-standing ask from the community. Robyn has started as an experimental R package. While we understand the needs of the users, it's difficult to maintain a natively translated Python package during active development on the R-side.
+## Alternative 1: Quick start for RobynPy (Beta)
+
+The Python version of Robyn is rewritten from version 3.11.1 of Robyn's R package, using object-oriented programming principles and a modular architecture for a robust solution. It was developed using various LLMs and AI workflows, such as Llama. As is common with AI-based solutions, there may be issues in the translation from R to Python. However, we believe in the power of community collaboration and open-source contribution. Therefore, we are opening this project to the community to participate and contribute. Together, we can address and resolve any issues that arise, enhancing the functionality and efficiency of the Python version of Robyn. We look forward to your contributions and to the continuous improvement of this project.
+
+
+### 1. Installing the package
+
+Install the latest version of the Robyn Python package from PyPI:
+```
+pip3 install robynpy
+```
+
+Install from GitHub using requirements.txt:
+```
+pip3 install -r requirements.txt
+```
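
After either install route, it can be useful to confirm which version landed in the active environment. The following is a minimal sketch using only the Python standard library. It assumes the installed distribution is named `robynpy`, as in the `pip3` command above; `installed_version` is an illustrative helper, not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


found = installed_version("robynpy")
print(f"robynpy {found}" if found else "robynpy is not installed in this environment")
```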
+### 2. Getting started
+
+The directory `python/src/robyn/tutorials` contains tutorials for the most common scenarios. The tutorials use the simulated dataset provided in the package.
+
+### 3. Running end-to-end
+
+There are two ways of running Python Robyn.
+
+**Option 1:**
+
+`tutorial1.ipynb` is the main notebook that runs the end-to-end flow. It is designed for the majority of users, who would prefer a one-click solution that runs the Robyn flow end-to-end with minimal knowledge of the underlying logic. It should run without any changes if you wish to use the simulated dataset for testing purposes.
+
+This notebook uses the APIs available in `python/src/robyn/robyn.py` to set the configs, run feature engineering, run model training, evaluate models with clustering, generate one-pagers and perform budget allocation.
+
+Change any of the configs directly in the notebook, and avoid changing `robyn.py` for anything that is configurable.
+
+**Option 2:**
+
+`tutorial1_src.ipynb` runs the end-to-end flow of Robyn Python with a lot more flexibility. It is designed for users who would like more control over which modules are and aren't run (i.e. skipping clustering, one-pager plots, budget allocation etc.). It should run without any changes if you wish to use the simulated dataset for testing purposes.
+
+This notebook doesn't use the APIs available in `python/src/robyn/robyn.py`; instead, it calls the modules directly with the appropriate parameters. It is therefore more flexible, but expects users to understand the underlying logic, which may change when using various parameter values.
+
+## Alternative 2: The Python wrapper
 
 The idea of a plumber API for Python is originally proposed by [Alex Rowley](https://www.facebook.com/groups/robynmmm/posts/1493036524797809/) from the Robyn community in August 2023. The Robyn team has assessed the proposal that is not only a great work-around for Python users to start with Robyn, but it actually allows API calls from any languages. We're very grateful for the collective wisdom of the open source community.
 
 #### Robyn API for Python beta release
-The first version of the API is released on Nov.22nd 2023 on the [Meta OST summit](https://metaostsummit23.splashthat.com/?fbclid=IwAR1SRBTZGw0GIoaF0XJq_eCWFZsZbyK0KP7P4RLKoee1IVbs8H56so3giwg). This [Jupyter notebook](https://github.com/facebookexperimental/Robyn/blob/main/robyn_api/robyn_python_notebook.ipynb) shows how to call the API from Python. In the beta version, the user needs to have the Robyn R package successfully installed first. We'll work on the mitigation of installation friction in the future.
+The first version of the API is released on Nov.22nd 2023 on the [Meta OST summit](https://metaostsummit23.splashthat.com/?fbclid=IwAR1SRBTZGw0GIoaF0XJq_eCWFZsZbyK0KP7P4RLKoee1IVbs8H56so3giwg). This [Jupyter notebook](https://github.com/facebookexperimental/Robyn/blob/main/robyn_api/robyn_python_notebook.ipynb) shows how to call the API from Python. In the beta version, the user needs to have the Robyn R package successfully installed first. We'll work on the mitigation of installation friction in the future.
 
 <img alt="Robyn API for Python Architecture" src={useBaseUrl('/img/robyn_api_architecture.png')} />
