---
title: "Basic samovar usage"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{samovar-basic}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 10
)
library(samovaR)
library(tidyverse)
```
# Download data
Download and visualize data from GMrepo.
```{r API}
teatree <- GMrepo_type2data(number_to_process = 1500)
viz_composition(teatree, type = "tile", interactive = FALSE, top = 10)
```
# Preprocessing
## Filter data
```{r filter}
tealeaves <- teatree %>%
  teatree_trim(treshhold_species = 3,
               treshhold_samples = 3,
               treshhold_amount = 10^(-3))
```
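To give a feel for what threshold-based filtering does, here is a minimal base R sketch on a toy abundance matrix. It is not the `teatree_trim()` implementation: the toy data, the names `min_amount` and `min_samples`, and the reading of the thresholds as "a species must reach a minimum amount in a minimum number of samples" are all assumptions made for illustration.

```{r filter_sketch}
# Hypothetical toy abundance matrix: rows are species, columns are samples
set.seed(1)
abund <- matrix(runif(6 * 5, min = 0.01, max = 1), nrow = 6,
                dimnames = list(paste0("sp", 1:6), paste0("s", 1:5)))
abund[1, ] <- 0      # a species that never appears
abund[2, 1:4] <- 0   # a species seen in a single sample only

min_amount  <- 1e-3  # assumed: smallest abundance that counts as "present"
min_samples <- 3     # assumed: a species must be present in at least 3 samples

# Keep species that are present in enough samples
present <- abund >= min_amount
keep_species <- rowSums(present) >= min_samples
filtered <- abund[keep_species, , drop = FALSE]
dim(filtered)
```

In this toy case the two rare species (`sp1` and `sp2`) are dropped and the remaining four are kept.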
## Normalizing
If you build the teatree object yourself, you must rescale it during construction, either by calling `teatree$rescale()` or by assigning `teatree$min_value` and `teatree$max_value`.
A good approximation to a normal distribution is required for the `glm` generating methods.
```{r normalize}
teabag <- tealeaves %>%
  tealeaves_pack()
```
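The two ideas in this step, rescaling and checking closeness to a normal distribution, can be illustrated with base R alone. The sketch below uses simulated log-normal values as a stand-in for abundances; the helper `rescale01()` is hypothetical, and the log transform is only one common choice, not necessarily the one `tealeaves_pack()` applies.

```{r rescale_sketch}
set.seed(42)
# Hypothetical skewed "abundances" drawn from a log-normal distribution
x <- rlnorm(200, meanlog = -4, sdlog = 1)

# Min-max rescaling to the [0, 1] interval (hypothetical helper)
rescale01 <- function(v) (v - min(v)) / (max(v) - min(v))
x_scaled <- rescale01(x)
range(x_scaled)

# Shapiro-Wilk test: a small p-value indicates departure from normality
p_raw <- shapiro.test(x)$p.value
p_log <- shapiro.test(log(x))$p.value
c(raw = p_raw, log = p_log)
```

The raw values are strongly skewed, so the test rejects normality for them, while the log-transformed values should look much closer to normal.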
## Clustering
Cluster the species using either hierarchical (deprecated) or aggregating clustering.
Remember: if you want to re-filter the data, it is better to redo the welding stage afterwards to avoid crashes later on!
```{r cluster}
concotion <- teabag %>%
  teabag_brew(min_cluster_size = 40, max_cluster_size = 150)
```
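For readers unfamiliar with clustering species by co-occurrence, the following base R sketch shows generic hierarchical clustering on a toy matrix. It is an illustration of the general technique only, not of what `teabag_brew()` does internally; the toy data, the correlation distance, and the average linkage are all assumptions.

```{r cluster_sketch}
set.seed(7)
# Hypothetical toy data: 20 species x 30 samples, built as two groups
# of species that co-vary around a shared profile
base_a <- rnorm(30)
base_b <- rnorm(30)
toy <- rbind(
  t(replicate(10, base_a + rnorm(30, sd = 0.1))),
  t(replicate(10, base_b + rnorm(30, sd = 0.1)))
)
rownames(toy) <- paste0("sp", 1:20)

# Correlation distance between species, then average-linkage clustering
d <- as.dist(1 - cor(t(toy)))
tree <- hclust(d, method = "average")

# Cut the tree into two clusters and inspect their sizes
clusters <- cutree(tree, k = 2)
table(clusters)
```

Species that follow the same profile across samples end up in the same cluster, which is the property a cluster-wise generator can then exploit.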
## Build samovar
```{r build}
data.samovar <- concotion %>%
  concotion_pour(probability_calculation = "simple")
```
## Generate data
```{r generate}
new_data <- data.samovar %>%
  samovar_boil(N = 100, avoid_zero_generations = TRUE)
```
The generated data:
```{r generate_viz}
viz_composition(new_data, reord_samples = "hcl", type = "tile", interactive = FALSE, top = 10)
```