`Stickleback.Rd`

Define a Stickleback model, used for automated detection of behavioral events in bio-logging data.

`Stickleback(tsc, win_size, tol, nth = 1, n_folds = 4, seed = NULL)`

- tsc
`[py:sktime.base.BaseEstimator]`

A time series classifier created with either `compose_tsc` or `create_tsc`.

- win_size
`[integer(1)]`

Sliding window size in number of observations. E.g., for 10 Hz data and a 5 s sliding window, `win_size` should be 50.

- tol
`[numeric(1)]`

Prediction tolerance, in seconds. See `sb_assess` for details.

- nth
`[integer(1)]`

Sliding window step size. For example, when `nth` = 1, the time series classifier (`tsc`) will make predictions on every window. When `nth` = 2, `tsc` predictions are only generated for every other window. Higher `nth` values reduce the time to fit a Stickleback model and generate predictions, at the potential cost of reduced prediction accuracy.

- n_folds
`[integer(1)]`

Number of folds for internal cross validation. `n_folds` must be at least 2. Larger `n_folds` values increase model fitting time, but may improve out-of-sample accuracy.

- seed
`[integer(1)]`

Random number seed for model reproducibility. CURRENTLY NOT WORKING (see issue #6).
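The relationship between `win_size`, `nth`, and the number of windows the classifier sees can be sketched in base R. This is an illustration only: the record length (`n_obs`) is hypothetical, and indexing windows by their start position is an assumption, not necessarily how the package aligns windows internally.

```r
# Hypothetical recording settings (illustrative values, not package defaults).
sampling_rate_hz <- 10   # observations per second
window_s <- 5            # desired window length, in seconds

# win_size is expressed in observations, so convert from seconds.
win_size <- as.integer(sampling_rate_hz * window_s)

# Assumption: windows are indexed by their start position within a
# record of n_obs observations (n_obs is a made-up value).
n_obs <- 1000L
all_starts <- seq_len(n_obs - win_size + 1L)

# With nth = 10, only every 10th window is passed to the classifier.
nth <- 10L
kept_starts <- all_starts[seq(1L, length(all_starts), by = nth)]

win_size
length(all_starts)
length(kept_starts)
```

Under these assumptions, raising `nth` from 1 to 10 cuts the number of windows to classify by roughly a factor of ten, which is where the savings in fitting and prediction time come from.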

There are two challenges facing automated behavioral event detection in bio-logging data. First, bio-logging data are time series, and most classification algorithms perform poorly on time series. Second, the resolution of bio-logging data greatly exceeds the frequency of many biological rates, creating a class imbalance problem. For example, bio-logging data collected from baleen whales are often standardized at 10 Hz, but feeding rates are approximately 200-500 events per day. Therefore, the "behavioral event" class is thousands of times smaller than the "non-event" class.
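The size of the imbalance follows from simple arithmetic, worked through here in base R. The sketch makes one simplifying assumption: each event is counted as a single sample, and the sensor records continuously for 24 hours.

```r
# Sampling rate and feeding rates quoted in the text above.
sampling_rate_hz <- 10
events_per_day <- c(low = 200, high = 500)

# Assumption: continuous recording over a full day.
samples_per_day <- sampling_rate_hz * 60 * 60 * 24

# Ratio of total samples to event samples, treating each event
# as one sample (a simplification for illustration).
imbalance_ratio <- samples_per_day / events_per_day

samples_per_day
imbalance_ratio
```

At 10 Hz there are 864,000 samples per day, so 200 to 500 events correspond to roughly one event per 1,700 to 4,300 samples.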

Stickleback addresses these challenges in a two-stage process. First, it uses
classification algorithms specifically designed for time series data by
interfacing with the sktime Python
package. Second, it under-samples the majority class ("non-events") when
training the classifier, then optimizes event prediction using internal
cross-validation. See `vignette("rstickleback")` for more details.
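The idea of under-sampling the majority class can be illustrated with a generic random under-sampling sketch in base R. This is not the package's implementation, which may choose non-event windows differently; the labels, counts, and seed below are all made up for the example.

```r
set.seed(42)  # arbitrary seed, used only to make this sketch repeatable

# Simulated window labels: 20 events among 10,000 windows (made-up numbers).
n_windows <- 10000
labels <- rep("non-event", n_windows)
labels[sample(n_windows, 20)] <- "event"

event_idx <- which(labels == "event")
nonevent_idx <- which(labels == "non-event")

# Random under-sampling: keep every event and draw an equal number
# of non-events, without replacement.
kept_nonevents <- sample(nonevent_idx, length(event_idx))
train_idx <- sort(c(event_idx, kept_nonevents))

# The training subset is now balanced between the two classes.
table(labels[train_idx])
```

Training on a balanced subset like this keeps the classifier from simply predicting "non-event" everywhere, at the cost of discarding most of the majority-class data.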

- `local_clf`
`[py:sktime.base.BaseEstimator]`

A time series classifier, inheriting from sktime's BaseEstimator.

- `win_size`
`[integer(1)]`

Sliding window size.

- `tol`
`[numeric(1)]`

Prediction tolerance, in seconds.

- `nth`
`[integer(1)]`

Sliding window step size.

- `n_folds`
`[integer(1)]`

Number of folds for global cross validation step.

- `seed`
`[integer(1)]`

Random number seed.

- `.stickleback`
`[py:Stickleback]`

Python Stickleback object.

```
# Load sample data
c(lunge_sensors, lunge_events) %<-% load_lunges()

# Define a time series classifier
tsc <- compose_tsc(module = "interval_based",
                   algorithm = "SupervisedTimeSeriesForest",
                   params = list(n_estimators = 2L, random_state = 4321L),
                   columns = columns(lunge_sensors))

# Define a Stickleback model
sb <- Stickleback(tsc,
                  win_size = 50,
                  tol = 5,
                  nth = 10,
                  n_folds = 4,
                  seed = 1234)
```