Helper functions
OTRecod.disp_inst_info — Method
disp_inst_info(inst)
Display information about the distance between the modalities.
OTRecod.compute_distrib_error! — Method
compute_distrib_error!(sol, inst, empiricalZA, empiricalYB)
Compute errors in the conditional distributions of a solution.
OTRecod.compute_distrib_error_3covar — Method
compute_distrib_error_3covar(sol, inst, empiricalZA, empiricalYB)
OTRecod.compute_pred_error! — Function
compute_pred_error!(sol, inst)
compute_pred_error!(sol, inst, proba_disp)
compute_pred_error!(sol, inst, proba_disp, mis_disp)
compute_pred_error!(sol, inst, proba_disp, mis_disp, full_disp)
Compute prediction errors in a solution.
OTRecod.aggregate_per_covar_mixed — Function
aggregate_per_covar_mixed(inst)
aggregate_per_covar_mixed(inst, norme)
aggregate_per_covar_mixed(inst, norme, aggregate_tol)
OTRecod.empirical_distribution — Function
empirical_distribution(inst)
empirical_distribution(inst, norme)
empirical_distribution(inst, norme, aggregate_tol)
Return the empirical cardinality of the joint occurrences of (C=x, Y=mA, Z=mB) in both bases.
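To make "empirical cardinality of the joint occurrences" concrete, the sketch below counts how often each (covariate profile, Y modality, Z modality) triple appears in three aligned vectors. It is a standalone illustration and does not call OTRecod: the function name `joint_counts` and the plain-vector inputs are assumptions made for the example, and the package's actual return type may differ.

```julia
# Hypothetical helper, not part of OTRecod: count how often each
# (covariate profile, Y modality, Z modality) triple occurs.
function joint_counts(X::AbstractVector, Y::AbstractVector, Z::AbstractVector)
    length(X) == length(Y) == length(Z) ||
        throw(ArgumentError("inputs must have equal length"))
    counts = Dict{Tuple{eltype(X), eltype(Y), eltype(Z)}, Int}()
    for key in zip(X, Y, Z)
        # Increment the counter of this triple, starting from zero.
        counts[key] = get(counts, key, 0) + 1
    end
    return counts
end

X = [1, 1, 2, 2, 1]   # covariate profile of each individual
Y = [1, 1, 2, 1, 1]   # modality of outcome Y
Z = [3, 3, 1, 1, 2]   # modality of outcome Z
counts = joint_counts(X, Y, Z)
```

Here `counts[(1, 1, 3)]` is the number of individuals with profile 1, Y = 1 and Z = 3.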
OTRecod.average_distance_to_closest — Method
average_distance_to_closest(inst, percent_closest)
Compute the cost between pairs of outcomes as the average distance between the covariates of individuals with these outcomes, considering only the fraction percent_closest of closest neighbors.
OTRecod.avg_distance_closest — Method
avg_distance_closest(inst, base1, base2, outcome, m1, m2, percent_closest)
Compute the average distance between the individuals of base1 with modality m1 for outcome and the individuals of base2 with modality m2 for outcome.
Only the percent_closest closest individuals are considered when computing the distance.
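The idea of averaging over only the closest individuals can be sketched as follows. This is a minimal standalone illustration, not the package's implementation: the function name `mean_closest_distance`, the precomputed distance matrix `D`, and the rule of always keeping at least one neighbor are assumptions made for the example.

```julia
# Hypothetical helper, not part of OTRecod: D[i, j] is the distance between
# individual i of the first group and individual j of the second group.
function mean_closest_distance(D::AbstractMatrix{<:Real}, percent_closest::Real)
    n1, n2 = size(D)
    # Number of neighbors kept per individual (assumed: at least one).
    nbclose = max(round(Int, percent_closest * n2), 1)
    total = 0.0
    for i in 1:n1
        # The nbclose smallest distances from individual i to the second group.
        closest = partialsort(D[i, :], 1:nbclose)
        total += sum(closest) / nbclose
    end
    return total / n1
end

D = [1.0 2.0 10.0 20.0;
     3.0 1.0 30.0  5.0]
avg_half = mean_closest_distance(D, 0.5)   # keeps 2 of the 4 neighbors per row
```

With `percent_closest = 1.0` the function reduces to the plain mean of all pairwise distances, so the parameter controls how strongly distant individuals are ignored.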
OTRecod.empirical_estimator — Function
empirical_estimator(path)
empirical_estimator(path, observed)
Get an empirical estimator of the distribution of Z conditional on Y and X in base A, and reciprocally in base B, obtained with a specific type of data sets.
path: path of the directory containing the data set
observed: if nonempty, list of indices of the observed covariates; this makes it possible to exclude some latent variables.
OTRecod.simulate — Function
simulate()
simulate(R2)
simulate(R2, muA)
simulate(R2, muA, muB)
simulate(R2, muA, muB, alphaA)
simulate(R2, muA, muB, alphaA, alphaB)
simulate(R2, muA, muB, alphaA, alphaB, n)
simulate(R2, muA, muB, alphaA, alphaB, n, q1)
simulate(R2, muA, muB, alphaA, alphaB, n, q1, q2)
simulate(R2, muA, muB, alphaA, alphaB, n, q1, q2, q3)
Simulate one dataset with three covariates described by their mean in each database (muA and muB) and the quantiles used for discretization (q1, q2, q3).
The dependency of the outcomes on the covariates is linear and given by the weights alphaA and alphaB and by the R2 coefficient.
The instance contains n individuals in each base.
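The discretization step can be pictured with the short sketch below, which maps a continuous covariate to bin indices using cut points placed at chosen quantile levels. It is only an illustration under stated assumptions: the function name `discretize` is hypothetical, and the use of empirical quantiles of the sample is an assumption, since the text above does not say how simulate turns q1, q2 and q3 into cut points.

```julia
using Statistics

# Hypothetical helper, not part of OTRecod: map each value to the index of
# the quantile bin it falls in (1 for the lowest bin).
function discretize(x::AbstractVector{<:Real}, probs::AbstractVector{<:Real})
    cuts = quantile(x, probs)   # empirical cut points (an assumption)
    # A value's bin is one plus the number of cut points strictly below it.
    return [1 + count(c -> v > c, cuts) for v in x]
end

x = collect(1.0:8.0)
bins = discretize(x, [0.25, 0.5, 0.75])
```

Three quantile levels give four bins, so each discretized covariate takes values in 1 to 4 in this sketch.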
OTRecod.bound_prediction_error — Function
bound_prediction_error(inst)
bound_prediction_error(inst, norme)
bound_prediction_error(inst, norme, aggregate_tol)
Compute a bound on the average prediction error in each base. The bound is computed as the expected prediction error assuming that the distribution of Z in base A (and that of Y in base B) is known, and that the prediction is made with the value that maximizes the probability.
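The reasoning behind this bound can be sketched in a few lines: if the joint distribution is known and the most probable value is always predicted, the expected error is one minus the probability mass of the most likely outcome in each row. The code below is a standalone illustration, not the package's implementation; the function name `mode_prediction_error` and the matrix layout (rows for the conditioning profile, columns for the outcome to predict) are assumptions made for the example.

```julia
# Hypothetical helper, not part of OTRecod.
# joint[x, z] is the probability (or count) of profile x together with outcome z.
function mode_prediction_error(joint::AbstractMatrix{<:Real})
    total = sum(joint)
    # For each profile x, predicting the most likely z is correct with
    # mass max_z joint[x, z]; the rest of the row is mispredicted.
    correct = sum(maximum(joint, dims = 2))
    return 1 - correct / total
end

joint = [0.30 0.10;
         0.05 0.25;
         0.15 0.15]
err = mode_prediction_error(joint)
```

No predictor that relies only on the profile x can do better on average than this value, which is why it serves as a reference point for the errors reported by compute_pred_error!.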
OTRecod.compute_average_error_bound — Function
compute_average_error_bound(path)
compute_average_error_bound(path, norme)
Compute a lower bound on the best average prediction error that one can obtain with a specific type of data sets.
path: path of the directory containing the data set