Assessment Algorithm

To compare automatic and visual annotations in micro-events

Contact: Stéphanie DEVUYST, Université de Mons, Faculté Polytechnique de Mons - TCTS Lab, 31, Boulevard Dolez, B-7000 Mons (Belgium), ph: +32 (65) 37.47.20, fax: +32 (65) 37.47.29, stephanie.devuyst@umons.ac.be


During the DREAMS project, we developed and tested several automatic procedures to detect micro-events such as sleep spindles, K-complexes, REMS, etc.

Unfortunately, we observed that the performances reported by other authors in the same field were hardly comparable, because their methodologies, their databases and their assessment methods were radically different.

To solve this problem and allow future works to be evaluated and compared, we made our database, our visual scorings and our automatic scorings freely available on the web (Here). In addition, we proposed a single assessment method that uses a well-defined terminology and from which all the desired confusion matrices can be established. This assessment method is presented below, and the corresponding algorithm, implemented in Matlab, can be downloaded Here.

Our assessment algorithm can take into account the visual scoring of one or two experts.

Knowing the start time and duration of the annotated micro-events (scored visually on the one hand, and by the automatic algorithm on the other), it counts the occurrences of each possible covering, as illustrated in Fig. 1. These possible configurations are gathered into 4 categories:

- type T1 (A, B, or C) corresponds to a correct automatic detection, since at least one of the two experts has scored the event as such;
- type T2 corresponds to a false detection;
- type T3 (A, B, C, or D) corresponds to a missed detection with respect to one or both experts;
- type T5 (A, B, or C) corresponds to multiple coverings involving an automatic detection.
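
The covering test behind these categories can be sketched as follows for the simplest case of one expert. This is only an illustration under stated assumptions, not the code of assessmentMethod.m: the annotation values and all variable names are hypothetical, two events are assumed to "cover" each other as soon as they share any time, and multiple coverings (types T3D and T5) are not handled.

```matlab
% Hypothetical annotations, one row per event: [start, duration] in seconds
aut = [10.0 1.0; 25.0 0.8; 40.0 1.2];   % automatic detections
vis = [10.4 0.9; 41.0 0.5; 60.0 1.0];   % visual scoring of one expert

aut_start = aut(:,1);  aut_end = aut(:,1) + aut(:,2);
vis_start = vis(:,1);  vis_end = vis(:,1) + vis(:,2);

% overlap(i,j) is true when automatic event i and visual event j share time
overlap = bsxfun(@lt, aut_start, vis_end.') & bsxfun(@gt, aut_end, vis_start.');

aut_matched = any(overlap, 2);     % automatic events covered by the expert
vis_matched = any(overlap, 1).';   % expert events covered by the system

n_correct = sum(aut_matched);      % one-expert analogue of type T1
n_false   = sum(~aut_matched);     % analogue of type T2
n_missed  = sum(~vis_matched);     % analogue of type T3
```

With two experts, the same test would be repeated against the second scoring and the results combined to separate the A, B and C sub-types.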

Once the number of occurrences of each type is known, it is easy to deduce the number of true positives (#TP), the number of false positives (#FP) and the number of false negatives (#FN) of the different confusion matrices (see Table I). Moreover, the number of true negatives (#TN) can be approximated by (total duration of the excerpt / average duration of the micro-event) - (#TP + #FP + #FN).
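
As a small worked example of this approximation, the sketch below applies it to made-up numbers. The counts, the excerpt duration and the variable names (other than duration_event) are hypothetical and chosen only for illustration.

```matlab
% Hypothetical values, for illustration only
excerpt_duration = 1800;   % total duration of the excerpt, in seconds
duration_event   = 1;      % average duration of the micro-event, in seconds
TP = 120;  FP = 30;  FN = 25;

% Number of "slots" of one event length in the excerpt, minus the slots
% already accounted for by the other three cells of the confusion matrix
TN = excerpt_duration / duration_event - (TP + FP + FN);
```

Because the excerpt is much longer than the cumulated events, #TN is typically far larger than the other three counts, which is worth keeping in mind when reading agreement and specificity values.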

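Finally, the parameters commonly used in the literature are easily deduced from these confusion matrices, namely:

- The total agreement rate:

$$\text{agreement rate} = \frac{\#TP + \#TN}{\#TP + \#TN + \#FP + \#FN}$$

- The sensitivity, representing the probability that the automatic detection algorithm gives a positive result in the presence of a reference annotation:

$$\text{sensitivity} = \frac{\#TP}{\#TP + \#FN}$$

- The specificity, representing the probability that the automatic detection algorithm gives a negative result in the absence of any reference annotation:

$$\text{specificity} = \frac{\#TN}{\#TN + \#FP}$$

- The false positive rate:

$$\text{false positive rate} = \frac{\#FP}{\#FP + \#TN} = 1 - \text{specificity}$$

- The proportion of false detections compared to the actual number of micro-events (according to the expert):

$$\text{false positive proportion} = \frac{\#FP}{\#TP + \#FN}$$

- The amount of false alarms among all automatic detections:

$$\text{false alarm ratio} = \frac{\#FP}{\#TP + \#FP}$$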
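
The parameters listed above can be computed from the four counts of a confusion matrix, as in the minimal sketch below. It uses the standard definitions of these quantities, which are assumed here to match the ones intended on this page; the counts and the variable names are hypothetical.

```matlab
% Hypothetical counts taken from one confusion matrix
TP = 120;  FP = 30;  FN = 25;  TN = 1625;

agreement_rate            = (TP + TN) / (TP + TN + FP + FN);
sensitivity               = TP / (TP + FN);
specificity               = TN / (TN + FP);
false_positive_rate       = FP / (FP + TN);   % equals 1 - specificity
false_positive_proportion = FP / (TP + FN);   % relative to the expert's events
false_alarm_ratio         = FP / (TP + FP);   % among all automatic detections

fprintf('sensitivity = %.3f, specificity = %.3f\n', sensitivity, specificity);
```

Since #TN is only an approximation that depends on duration_event, the agreement rate, the specificity and the false positive rate inherit that approximation, while the other three parameters do not.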

The algorithms for the assessment method have been developed in Matlab.

- Function assessmentMethod.m compares the automatic and visual annotations of one type of micro-event, carried out on one excerpt.

**Inputs:**

Function assessmentMethod.m takes as inputs files in the same format as those provided in our public databases:

* filename_edf : filename of the polysomnographic recording (in the European Data Format) from which the annotated channel is extracted.

* filename_Automatic_detection : filename of a text file containing, on the first line, the name of the detection and then 2 columns with the start times (first column) and durations (second column) of the automatically detected micro-events (in seconds).

* filename_Visual_scoring1 : filename of a text file containing, on the first line, the name of the detection and then 2 columns with the start times (first column) and durations (second column) of the micro-events visually scored by expert 1 (in seconds).

* filename_Visual_scoring2 : filename of a text file containing, on the first line, the name of the detection and then 2 columns with the start times (first column) and durations (second column) of the micro-events visually scored by expert 2 (in seconds). If this second annotation is not available, use filename_Visual_scoring2=''.

* filename_Hypnogram : filename of a text file containing, from the second line onwards, the hypnogram information, with one value per 5 seconds:

- 5=wake

- 4=REM stage

- 3=sleep stage S1

- 2=sleep stage S2

- 1=sleep stage S3

- 0=sleep stage S4

- -1=sleep stage movement

- -2 or -3=unknown sleep stage

If this Hypnogram is not available, use filename_Hypnogram=''.

* duration_event : average duration (in seconds) of the detected micro-event (used to approximate the number of true negatives #TN in the confusion matrices).
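
To make the annotation file format described above concrete, here is a minimal sketch that writes a tiny file in that layout and reads it back. It is not taken from assessmentMethod.m: the header text, the event values and the variable names are hypothetical, and the two columns are assumed to be separated by whitespace.

```matlab
% Write a small hypothetical annotation file so the sketch is self-contained
tmpfile = [tempname() '.txt'];
fid = fopen(tmpfile, 'w');
fprintf(fid, '[demo_detection]\n');                        % first line: name
fprintf(fid, '%.2f %.2f\n', [12.50 0.75; 30.00 1.10].');   % start, duration
fclose(fid);

% Read it back: take the first line as the name, then parse the 2 columns
fid = fopen(tmpfile, 'r');
detection_name = fgetl(fid);
data = textscan(fid, '%f %f');
fclose(fid);
delete(tmpfile);

starts    = data{1};   % start times, in seconds
durations = data{2};   % durations, in seconds
```

A hypnogram file could be read the same way with a single '%f' column, since it also starts with one header line.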

**Outputs:**

Function assessmentMethod.m returns a structure "nbr_types" with the following fields:

* nbr_types.nbrtot_scored_by_system = total number of events scored by the automatic system, in each stage (wake, REM, S1, S2, S3, S4) and in total (i.e. a vector of 7 values).

* nbr_types.nbrtot_scored_by_visual1 = total number of events scored by expert 1, in each stage (wake, REM, S1, S2, S3, S4) and in total.

* nbr_types.nbrtot_scored_by_visual2 = total number of events scored by expert 2, in each stage (wake, REM, S1, S2, S3, S4) and in total.

* nbr_types.nbr_scored_only_by_sys__type2 = number of coverings scored only by the system (type T2), in each stage (wake, REM, S1, S2, S3, S4) and in total.

* nbr_types.nbr_scored_only_by_vis1__type3A = number of coverings scored only by scorer #1 (type T3A), in each stage (wake, REM, S1, S2, S3, S4) and in total.

* nbr_types.nbr_scored_only_by_vis2__type3B = number of coverings scored only by scorer #2 (type T3B), in each stage (wake, REM, S1, S2, S3, S4) and in total.

* nbr_types.nbr_scored_only_by_systEvis1__type1A = number of coverings scored only by the system and scorer #1 (type T1A), in each stage (wake, REM, S1, S2, S3, S4) and in total.

* nbr_types.nbr_scored_only_by_systEvis2__type1B = number of coverings scored only by the system and scorer #2 (type T1B), in each stage (wake, REM, S1, S2, S3, S4) and in total.

* nbr_types.nbr_scored_only_by_vis1Evis2__type3C = number of coverings scored only by scorer #1 and scorer #2 (type T3C), in each stage (wake, REM, S1, S2, S3, S4) and in total.

* nbr_types.nbr_scored_by_vis1Evis2Esyst__type1C = number of coverings scored by the system, scorer #1 and scorer #2 (type T1C), in each stage (wake, REM, S1, S2, S3, S4) and in total.

* nbr_types.type3D = number of coverings of type T3D, in each stage (wake, REM, S1, S2, S3, S4) and in total.

* nbr_types.type5A = number of coverings of type T5A, in each stage (wake, REM, S1, S2, S3, S4) and in total.

* nbr_types.type5B = number of coverings of type T5B, in each stage (wake, REM, S1, S2, S3, S4) and in total.

* nbr_types.type5C = number of coverings of type T5C, in each stage (wake, REM, S1, S2, S3, S4) and in total.

* nbr_types.nbr_aut_multiple = number of automatic annotations involved in a multiple covering (3D, 5A, 5B or 5C), in each stage (wake, REM, S1, S2, S3, S4) and in total.

* nbr_types.nbr_vis1_multiple = number of annotations of scorer 1 involved in a multiple covering (3D, 5A, 5B or 5C), in each stage (wake, REM, S1, S2, S3, S4) and in total.

* nbr_types.nbr_vis2_multiple = number of annotations of scorer 2 involved in a multiple covering (3D, 5A, 5B or 5C), in each stage (wake, REM, S1, S2, S3, S4) and in total.
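
The 7-value layout shared by all these fields can be illustrated with the following sketch, which counts events per sleep stage from a hypnogram sampled every 5 seconds. It is an illustration only: the data and variable names are hypothetical, and it assumes that an event is assigned to the stage of the 5-second period containing its start time and that the last value counts every event, which may differ from what assessmentMethod.m does.

```matlab
% Hypothetical hypnogram (one value per 5 s) and event start times (seconds)
hypnogram    = [5 5 3 2 2 2 1 0 4 4];
event_starts = [11.0 17.5 22.0 36.0 42.5];
epoch_length = 5;

% Index of the 5-second period that contains each event start
epoch_idx   = floor(event_starts / epoch_length) + 1;
event_stage = hypnogram(epoch_idx);

% Hypnogram codes in the order wake, REM, S1, S2, S3, S4
stage_codes = [5 4 3 2 1 0];
counts = zeros(1, 7);
for k = 1:numel(stage_codes)
    counts(k) = sum(event_stage == stage_codes(k));
end
counts(7) = numel(event_starts);   % last value: total over the excerpt
```

Events falling in movement or unknown epochs (codes -1, -2, -3) would appear in the total but in none of the six stage columns under these assumptions.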

It also returns the different possible confusion matrices:

* confusion_matrix_aut_Vis1 = confusion matrix of the automatic detection compared to the expert 1 scoring.

* confusion_matrix_aut_Vis2 = confusion matrix of the automatic detection compared to the expert 2 scoring.

* confusion_matrix_Vis1_Vis2 = confusion matrix of the expert 1 scoring compared to the expert 2 scoring.

* confusion_matrix_Vis2_Vis1 = confusion matrix of the expert 2 scoring compared to the expert 1 scoring.

* confusion_matrix_aut_Vis1uVis2 = confusion matrix of the automatic detection compared to the union of the expert 1 scoring and the expert 2 scoring (i.e. a micro-event is considered real when at least one scorer marked it as such).

* confusion_matrix_aut_Vis1nVis2 = confusion matrix of the automatic detection compared to the intersection of the expert 1 scoring and the expert 2 scoring (i.e. a micro-event is considered real only when both scorers marked it as such).
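
The difference between the union and intersection references can be sketched from the simple covering types. The mapping below is only reasoned from the type definitions given earlier; Table I is not reproduced on this page, so it should be read as an assumption rather than as the author's exact rule. The counts are hypothetical and multiple coverings (3D, 5A, 5B, 5C) are ignored.

```matlab
% Hypothetical totals for the simple (non-multiple) covering types
n.T1A = 10;  n.T1B = 8;  n.T1C = 50;   % system agrees with expert 1, 2, both
n.T2  = 12;                            % system alone
n.T3A = 6;   n.T3B = 4;  n.T3C = 9;    % expert 1 alone, 2 alone, both

% Reference = union: an event is real if at least one expert scored it
union_TP = n.T1A + n.T1B + n.T1C;
union_FP = n.T2;
union_FN = n.T3A + n.T3B + n.T3C;

% Reference = intersection: an event is real only if both experts scored it
inter_TP = n.T1C;
inter_FP = n.T2 + n.T1A + n.T1B;
inter_FN = n.T3C;
```

Under this reading, moving from the union to the intersection can only lower #TP and #FN and raise #FP, which is why the two matrices are reported separately.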

**Example (Matlab code):**

filename_edf='C:\DEVUYST\DataBaseSpindles\excerpt1.edf';

filename_Automatic_detection='C:\DEVUYST\DataBaseSpindles\Automatic_detection_excerpt1.txt';

filename_Visual_scoring1='C:\DEVUYST\DataBaseSpindles\Visual_scoring1_excerpt1.txt';

filename_Visual_scoring2='C:\DEVUYST\DataBaseSpindles\Visual_scoring2_excerpt1.txt';

filename_Hypnogram='C:\DEVUYST\DataBaseSpindles\Hypnogram_excerpt1.txt';

duration_event=1; % in seconds

[nbr_types, confusion_matrix_aut_Vis1, confusion_matrix_aut_Vis2, confusion_matrix_Vis1_Vis2, confusion_matrix_Vis2_Vis1, confusion_matrix_aut_Vis1uVis2, confusion_matrix_aut_Vis1nVis2]=assessmentMethod(filename_edf, filename_Automatic_detection, filename_Visual_scoring1, filename_Visual_scoring2, filename_Hypnogram, duration_event)

- Function global_assessmentMethod computes the global results of the assessmentMethod applied to a set of excerpts.

**Inputs:**

Function global_assessmentMethod takes as inputs:

* filepath : path to the files of the database.

* indice_excerpts : vector with the indices of the excerpts to consider.

* duration_event : average duration (in seconds) of the detected micro-event (used to approximate the number of true negatives #TN).

**Outputs:**

Function global_assessmentMethod returns the same outputs as the function assessmentMethod.m, with the values summed over all excerpts.

**Example (Matlab code):**

filepath='C:\DEVUYST\DataBaseSpindles\';

indice_excerpts=[1:6];% excerpts for the test

duration_event=1; % in seconds

global_assessmentMethod (filepath,indice_excerpts,duration_event)
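
The summation over excerpts can be pictured with the sketch below. It is not the code of global_assessmentMethod: the file naming pattern is assumed from the single-excerpt example above, and the per-excerpt count vectors and the shortened field name are hypothetical stand-ins for the results returned by assessmentMethod.m.

```matlab
filepath = 'C:\DEVUYST\DataBaseSpindles\';
indice_excerpts = 1:2;

% File names built from the excerpt index (naming pattern assumed)
edf_files = cell(size(indice_excerpts));
for i = 1:numel(indice_excerpts)
    edf_files{i} = sprintf('%sexcerpt%d.edf', filepath, indice_excerpts(i));
end

% Hypothetical per-excerpt results, one 7-value vector per field
per_excerpt = {struct('type2', [0 1 2 5 0 0 8]), ...
               struct('type2', [1 0 1 3 1 0 6])};

% Sum every field over the excerpts
total = per_excerpt{1};
names = fieldnames(total);
for e = 2:numel(per_excerpt)
    for f = 1:numel(names)
        total.(names{f}) = total.(names{f}) + per_excerpt{e}.(names{f});
    end
end
```

Summing the counts first and computing rates afterwards gives every event the same weight, whereas averaging per-excerpt rates would give every excerpt the same weight.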

To download, right-click a link and choose "Save target as".

- Function assessmentMethod.m, which compares the automatic and visual annotations of one type of micro-event, carried out on one excerpt.
- Function global_assessmentMethod.m, which computes the global results of the assessmentMethod applied to a set of excerpts.

- A publication that illustrates and validates this assessment method (by applying it to an automatic procedure for sleep spindle detection) can be found here;
- Databases annotated for other micro-events or for sleep stages can be found here;
- Other publications in the field can be found here.