---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:21376
- loss:MultipleNegativesRankingLoss
base_model: sentence-transformers/all-MiniLM-L6-v2
widget:
- source_sentence: Explain the purpose of source authentication in EPIC.
  sentences:
  - 'Golden file testing is integral to the SCION testing suite, ensuring consistency
    in test outputs. The -update flag is utilized across all packages containing golden
    file tests, allowing for systematic updates. To update all golden files, the command
    `go test ./... -update` is executed, while a specific package can be updated using
    `go test ./path/to/package -update`. The update mechanism is implemented via a
    package global variable: `var update = xtest.UpdateGoldenFiles()`.


    For tests involving non-deterministic elements, such as private keys and certificates,
    a separate flag, -update-non-deterministic, is employed. This allows for the updating
    of non-deterministic golden files with the command `go test ./... -update-non-deterministic`
    or for a specific package using `go test ./path/to/package -update-non-deterministic`.
    The corresponding global variable for this functionality is defined as `var updateNonDeterministic
    = xtest.UpdateNonDeterminsticGoldenFiles()`. This structured approach facilitates
    the management of both deterministic and non-deterministic golden files within
    the SCION architecture.'
  - EPIC introduces a family of cryptographic data-plane protocols designed to enhance
    security in path-aware Internet architectures by addressing the security-efficiency
    dilemma. The protocols facilitate source authentication, path validation, and
    path authorization while maintaining low communication overhead. EPIC employs
    short per-hop authentication fields to minimize overhead, ensuring that even if
    an attacker forges an authenticator, they cannot exploit it to launch volumetric
    DoS attacks. The design binds authenticators to specific packets, preventing further
    malicious packet transmission. Additionally, EPIC utilizes a longer, unforgeable
    authentication field for the destination, allowing detection of any deceptive
    packets that may have bypassed intermediate routers. The proposed attacker model
    combines a localized Dolev–Yao adversary with a cryptographic oracle, demonstrating
    EPIC's resilience against powerful attackers. EPIC's communication overhead is
    3–5 times smaller than existing solutions like OPT and ICING for realistic path
    lengths. Implementation using Intel’s Data Plane Development Kit (DPDK) shows
    that EPIC can saturate a 40 Gbps link on commodity hardware with only four processing
    cores. The focus is on securing inter-domain data-plane communication while assuming
    a secure control plane for key distribution and path construction.
  - The document chunk provides a comprehensive index of SCION Internet Architecture,
    detailing key components and concepts. It includes references to various types
    of certificates such as AS, CA, root, and voting certificates, essential for the
    SCION Control-plane PKI (CP-PKI). The control plane and data plane are delineated,
    with specific focus on control-plane extensions, including hidden paths and time
    synchronization, and data-plane extensions like end-to-end (E2E) and hop-by-hop
    (HBH) mechanisms. The COLIBRI framework is highlighted, encompassing control and
    data plane functionalities, end-to-end reservations (EER), and security analyses.
    Cryptographic algorithms are categorized, emphasizing agility, asymmetric, symmetric,
    and post-quantum methods, alongside cryptographic hash functions. Deployment scenarios
    are outlined, addressing customer site, end host, and ISP core network configurations.
    The document also references the Discovery service and DNSSEC, noting their relevance
    to SCION's operational integrity. Overall, the index encapsulates the architectural
    components, algorithms, and deployment strategies critical to SCION's design and
    implementation.
- source_sentence: How does SCION's deployment model differ from traditional overlay
    networks?
  sentences:
  - The document discusses the deployment and scalability of the SCION Internet architecture,
    emphasizing its path-diversity-based path construction algorithm for core beaconing
    as the SCIONLab network expands. It contrasts SCION with existing testbeds like
    VINI, GENI, and FABRIC, which have not yet facilitated SCION's production network,
    highlighting SCION's unique design that avoids overlay complexities. The section
    on related work outlines various Internet architecture proposals, including Trotsky,
    which advocates for a backward-compatible framework for new inter-domain protocols,
    and RON, an application-level overlay network that lacks the guarantees of native
    architectures. SCION's architecture distinctly separates inter-domain communication
    (core-path segments) from intra-domain communication (down- and up-path segments),
    paralleling concepts in Plutarch and HLP, which employs a hybrid routing approach.
    SCION's beaconing mechanism enhances scalability compared to BGP, while maintaining
    a hierarchical network partitioning akin to HLP's model. The document also references
    XIA's principal-centric networking and the Framework for Internet Innovation's
    clean-slate redesign, noting SCION's alignment with these innovative paradigms
    while addressing deployment challenges.
  - The document discusses the limitations of the Border Gateway Protocol (BGP) within
    the context of internet infrastructure, emphasizing its lack of guarantees for
    data delivery and path specification. BGP facilitates the exchange of reachability
    information among Autonomous Systems (ASes) through IP prefix advertisements,
    enabling ASes to make routing decisions based on various factors, including cost
    and load balancing. However, BGP's design, originating in 1989 as a temporary
    solution, did not prioritize security, leading to vulnerabilities such as route
    hijacking. These deficiencies impact both Quality of Service (QoS) and Quality
    of Experience (QoE), as users remain unaware of the paths taken by their data.
    The document highlights the need for improved protocols that address these security
    and performance issues, particularly in the context of SCION architecture, which
    aims to provide more reliable and secure routing mechanisms.
  - "The document chunk details the packet processing logic within the SCION Data\
    \ Plane, specifically focusing on the handling of Hop Fields and Accumulator values\
    \ during packet traversal. It outlines checks for link types to prevent valley\
    \ usage and ensures that timestamps in the Info Field are valid. The processing\
    \ steps vary based on the Construction Direction flag (C) and the Peering flag\
    \ (P). \n\nFor packets traveling in construction direction (C = \"1\"), the process\
    \ continues to the next step. If traveling against construction direction (C =\
    \ \"0\"), three cases are considered: \n\n1. **No Peering Hop Field (P = \"0\"\
    )**: The ingress border router computes the Accumulator (Acc) using the formula\
    \ Acc = Acc_(i+1) XOR MAC_i, updates the Acc field, verifies the MAC, and increments\
    \ path metadata if at the last Hop Field.\n\n2. **Peering Hop Field Present but\
    \ Not Current (P = \"1\")**: Similar to Case 1, but without incrementing path\
    \ metadata.\n\n3. **Current Hop Field is Peering (P = \"1\")**: The router computes\
    \ MAC^Peer_i, verifies it against the current MAC, and increments path metadata\
    \ if applicable. \n\nThese steps ensure integrity and proper routing within the\
    \ SCION architecture."
- source_sentence: What considerations are taken into account for Assigned SCION Protocol
    Numbers?
  sentences:
  - "Redundant transmission in SCION enhances the reliability of media streams, specifically\
    \ for video calls, by employing multiple relay connections. The application allows\
    \ for configurable redundant and multi-redundant transmission of audio and video\
    \ streams. Each media stream can utilize more than one relay connection, with\
    \ outgoing packets sent identically across these connections. Received packets\
    \ are forwarded to WebRTC, which utilizes RTP and RTCP protocols for connection\
    \ management, including sequence numbering and deduplication. \n\nOverlap path\
    \ processors facilitate the selection of paths for redundant relay connections.\
    \ The process begins with sorting the relay connections arbitrarily, followed\
    \ by the first connection utilizing the standard root path processor. Subsequent\
    \ connections employ an overlap path processor that references the path of the\
    \ preceding connection, ensuring that all paths are considered as reference paths\
    \ for redundancy. This design aims to maintain Quality of Service (QoS) by ensuring\
    \ that at least one relay connection successfully delivers packets, mitigating\
    \ packet loss.\n\nHowever, challenges arise from latency discrepancies between\
    \ redundant connections, which can lead to a primary connection (lead) and a secondary\
    \ connection (backup) scenario, potentially affecting the overall performance\
    \ of the media stream."
  - The document chunk addresses the availability guarantees within the SCION Internet
    Architecture, detailing the adversary model and availability properties. It outlines
    defense systems, including basic fault and attack isolation, protection mechanisms
    for data-plane traffic, and control-plane services. Key components include secure
    path discovery and dissemination, data delivery mechanisms, packet authentication,
    and filtering strategies. The section on traffic prioritization introduces traffic
    classes, Setup-Less Neighbor-Based Communication (SNC), traffic marking, and priority
    processing through queuing disciplines. Additionally, it discusses protected DRKey
    bootstrapping, emphasizing assumptions, the protection of COLIBRI SegR setup,
    and the Telescoped Reservation Setup (TRS) with a corresponding security analysis.
    The protection of control-plane services is further elaborated, focusing on criticality
    criteria for interactions, filtering at the control service, and the attack resilience
    of the control service. The chapter concludes with a discussion on AS certification,
    reinforcing the importance of these mechanisms in ensuring the robustness and
    reliability of SCION's architecture against various threats.
  - The document chunk outlines the SCION Data Plane architecture, detailing the SCION
    Header Specification, which includes various header formats such as Address Header,
    SCION Path Types (SCION, EmptyPath, OneHopPath, EPIC-HP), and the Pseudo Header
    for Upper-Layer Checksum. It specifies the SCION Extension Header, encompassing
    Hop-by-Hop and End-to-End Options Headers, along with TLV-encoded Options. The
    SCMP Specification is introduced, covering its general format, error messages,
    and informational messages. The SCION Packet Authenticator Option is defined,
    detailing its format, absolute time, DRKey selection, authenticated data, and
    associated algorithms. Additionally, the document addresses BFD (Bidirectional
    Forwarding Detection) on top of SCION, including protocol specifics and its implementation
    in SCION routers. It also lists assigned SCION Protocol Numbers, highlighting
    considerations for their assignment, and concludes with an overview of the SCION
    Protocol Stack, which integrates these components into a cohesive architecture.
- source_sentence: What happens when a local path service cannot find a path to a
    remote AS?
  sentences:
  - The document chunk details the verification of Go programs within the SCION architecture,
    specifically focusing on the `UnmarshalText` function for parsing Autonomous System
    (AS) identifiers. The `UnmarshalText` function, defined in the `addr` package,
    takes a byte slice `text`, converts it to a string, and invokes `ASFromString`
    to derive a valid AS identifier. The function employs preconditions and postconditions
    to ensure memory safety and validity of the AS identifier. The `validAS` function
    is a pure function that checks the validity of the AS identifier, returning true
    only for valid inputs. The use of the `old` keyword in postconditions guarantees
    that the state of the AS variable remains unchanged if the input string is invalid,
    ensuring that the function adheres to the principles of functional programming
    by avoiding side effects. The implementation details emphasize the importance
    of error handling, where `ASFromString` returns a non-nil error for invalid identifiers,
    thus maintaining the integrity of the AS state. This snippet is part of the SCION
    codebase, specifically from `github.com/scionproto/scion/go/lib/addr/isdas.go`.
  - The document chunk discusses extensions for the SCION data plane, focusing on
    the COLIBRI system's attack-resistant resource-reservation mechanisms. It identifies
    challenges such as per-flow state management, path stability, and authentication
    overhead, alongside corresponding enabling technologies. Key solutions include
    packet-carried state for fast path management, path choice strategies to mitigate
    on-path adversaries, and symmetric-key authentication to address signature overhead.
    The architecture leverages Isolation Domains (ISDs) and segment types to enhance
    scalability by decomposing reservations. Reservation protection is emphasized,
    requiring data packets to carry cryptographically protected information for validation
    and attribution. Efficient cryptographic mechanisms are critical, particularly
    for per-packet MACs to authenticate data packets at each on-path Autonomous System
    (AS). The DRKey framework underpins the efficient authentication of control-plane
    packets, mitigating risks such as signature flooding and denial-of-capability
    (DoC) attacks. Overall, the integration of these components within SCION's architecture
    enhances the robustness and efficiency of resource reservations in adversarial
    environments.
  - The document discusses the optimization of path registration and propagation in
    SCION Autonomous Systems (ASes) through beacon services, utilizing information
    encoded in Path Control Blocks (PCBs). Beacon services can optimize paths based
    on various quality metrics, such as latency and bandwidth, either jointly or in
    parallel, depending on local implementations. For path lookup, a host within a
    SCION AS must obtain the destination host's ISD number, AS number, and local address,
    resulting in tuples of the form ⟨ISD number, AS number, local address⟩. The host
    queries its path service for a registered or cached path to the destination. If
    unavailable, it escalates the request to a core AS's path service within its ISD.
    If the destination is within the same ISD or is a core AS in another ISD, the
    core path service provides a path segment. If not, the local core path service
    queries the remote core path service of the destination's ISD, which returns a
    down-segment. The local core path service then returns both the core-segment and
    down-segment to facilitate routing. The structure of a PCB is detailed, highlighting
    fields for AS hop metadata and extensions for additional information dissemination.
- source_sentence: How does SCION's time synchronization support duplicate detection
    and traffic monitoring?
  sentences:
  - The document chunk details the COLIBRI system for bandwidth reservations within
    the SCION architecture, emphasizing its efficiency in managing end-to-end reservations
    (EERs) across on-path Autonomous Systems (ASes). COLIBRI employs symmetric cryptography
    for stateless verification of reservations, mitigating the need for per-source
    state. To counter replay attacks, a duplicate suppression mechanism is essential,
    ensuring that authenticated packets cannot be maliciously reused to exceed allocated
    bandwidth. Monitoring and policing systems are necessary to enforce bandwidth
    limits, allowing ASes to identify and manage misbehaving flows. While stateful
    flow monitoring, such as the token-bucket algorithm, provides precise measurements,
    it is limited to edge deployments due to state requirements. In contrast, probabilistic
    monitoring techniques like LOFT can operate effectively within the Internet core,
    accommodating a vast number of flows. Time synchronization is critical for coordinating
    bandwidth reservations across AS boundaries, facilitating accurate duplicate detection
    and traffic monitoring. SCION’s time synchronization mechanism ensures adequate
    synchronization levels for these operations. Overall, COLIBRI enables efficient
    short-term bandwidth reservations, crucial for maintaining the integrity of end-to-end
    communications in a highly dynamic Internet environment.
  - 'SCION is a next-generation Internet architecture designed to enhance security,
    availability, isolation, and scalability. It enables end hosts to utilize multiple
    authenticated inter-domain paths to any destination, with each packet carrying
    its own specified forwarding path determined by the sender. SCION architecture
    incorporates isolation domains (ISDs), which consist of multiple Autonomous Systems
    (ASes) that agree on a trust root configuration (TRC) defining the roots of trust
    for validating bindings between names and public keys or addresses. Core ASes
    govern each ISD, providing inter-ISD connectivity and managing trust roots. The
    architecture delineates three types of AS relationships: core, provider-customer,
    and peer-peer, with core relations existing solely among core ASes. SCION addressing
    is structured as a tuple ⟨ISD number, AS number, local address⟩, where the ISD
    number identifies the ISD of the end host, the AS number identifies the host''s
    AS, and the local address serves as the host''s identifier within its AS, not
    utilized for inter-domain routing or forwarding. Path-construction, path-registration,
    and path-lookup procedures are integral to SCION''s operational framework, facilitating
    efficient packet forwarding based on the specified paths.'
  - The document introduces a formal verification framework for path-aware data plane
    protocols, addressing the critical security property of path authorization in
    Internet architectures. It utilizes Isabelle/HOL to develop a parameterized model
    that first establishes path authorization without an attacker, then refines it
    by incorporating an attacker and cryptographic validation fields. The framework
    is parameterized by the protocol's authentication mechanism and relies on five
    verification conditions sufficient for proving the refinement of the abstract
    model. The authors validate the framework against several existing protocols,
    demonstrating compliance with the verification conditions and ensuring path authorization
    without requiring invariant proofs. This approach supports low-effort security
    proofs applicable to arbitrary network topologies and authorized paths, surpassing
    the capabilities of current automated security protocol verifiers. The document
    emphasizes the importance of formal verification in enhancing the security and
    reliability of future Internet architectures, particularly in light of the limitations
    of the existing Border Gateway Protocol (BGP) and the need for robust, scalable
    solutions. Path-aware architectures empower end hosts to select forwarding paths
    while ensuring adherence to autonomous systems' routing policies, thereby mitigating
    risks from malicious sources.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
model-index:
- name: SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dev evaluation
      type: dev-evaluation
    metrics:
    - type: cosine_accuracy@1
      value: 0.2032690695725063
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.44090528080469404
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.5368818105616094
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.662615255658005
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.2032690695725063
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.146968426934898
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.10737636211232188
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.0662615255658005
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.2032690695725063
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.44090528080469404
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.5368818105616094
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.662615255658005
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.4209375313714088
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.3449352372969312
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.35689195077195485
      name: Cosine Map@100
---

# SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2). It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) <!-- at revision fa97f6e7cb1a59073dff9e6b13e2715cf7475ac9 -->
- **Maximum Sequence Length:** 256 tokens
- **Output Dimensionality:** 384 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
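The three modules above form a pipeline: the transformer produces one vector per token, the pooling layer averages them (`pooling_mode_mean_tokens: True`), and `Normalize()` scales the result to unit length. The following is a minimal pure-Python sketch of the last two stages, assuming toy 3-dimensional token vectors instead of the real 384-dimensional ones. The helper names `mean_pool` and `l2_normalize` are illustrative and not part of the library.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, ignoring padded positions (mask == 0)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                summed[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [s / count for s in summed]

def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

# Toy input (hypothetical values): two real tokens and one padding token.
tokens = [[1.0, 2.0, 2.0], [3.0, 0.0, 2.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]

sentence_embedding = l2_normalize(mean_pool(tokens, mask))
```

Because the output is unit length, downstream code can compare embeddings with a plain dot product.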

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("tjohn327/scion-all-MiniLM-L6-v2")
# Run inference
sentences = [
    "How does SCION's time synchronization support duplicate detection and traffic monitoring?",
    'The document chunk details the COLIBRI system for bandwidth reservations within the SCION architecture, emphasizing its efficiency in managing end-to-end reservations (EERs) across on-path Autonomous Systems (ASes). COLIBRI employs symmetric cryptography for stateless verification of reservations, mitigating the need for per-source state. To counter replay attacks, a duplicate suppression mechanism is essential, ensuring that authenticated packets cannot be maliciously reused to exceed allocated bandwidth. Monitoring and policing systems are necessary to enforce bandwidth limits, allowing ASes to identify and manage misbehaving flows. While stateful flow monitoring, such as the token-bucket algorithm, provides precise measurements, it is limited to edge deployments due to state requirements. In contrast, probabilistic monitoring techniques like LOFT can operate effectively within the Internet core, accommodating a vast number of flows. Time synchronization is critical for coordinating bandwidth reservations across AS boundaries, facilitating accurate duplicate detection and traffic monitoring. SCION’s time synchronization mechanism ensures adequate synchronization levels for these operations. Overall, COLIBRI enables efficient short-term bandwidth reservations, crucial for maintaining the integrity of end-to-end communications in a highly dynamic Internet environment.',
    "SCION is a next-generation Internet architecture designed to enhance security, availability, isolation, and scalability. It enables end hosts to utilize multiple authenticated inter-domain paths to any destination, with each packet carrying its own specified forwarding path determined by the sender. SCION architecture incorporates isolation domains (ISDs), which consist of multiple Autonomous Systems (ASes) that agree on a trust root configuration (TRC) defining the roots of trust for validating bindings between names and public keys or addresses. Core ASes govern each ISD, providing inter-ISD connectivity and managing trust roots. The architecture delineates three types of AS relationships: core, provider-customer, and peer-peer, with core relations existing solely among core ASes. SCION addressing is structured as a tuple ⟨ISD number, AS number, local address⟩, where the ISD number identifies the ISD of the end host, the AS number identifies the host's AS, and the local address serves as the host's identifier within its AS, not utilized for inter-domain routing or forwarding. Path-construction, path-registration, and path-lookup procedures are integral to SCION's operational framework, facilitating efficient packet forwarding based on the specified paths.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
```
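For semantic search, one sentence acts as the query and the others as candidate passages. Since the final `Normalize()` module yields unit-length vectors, cosine similarity reduces to a dot product. Below is a minimal sketch of the ranking step, using small hand-written vectors as stand-ins for `model.encode` output. The values and the helper name `rank_passages` are assumptions for illustration only.

```python
def rank_passages(query_vec, passage_vecs, top_k=3):
    """Return (index, score) pairs sorted by descending dot product."""
    scores = [
        (idx, sum(q * p for q, p in zip(query_vec, vec)))
        for idx, vec in enumerate(passage_vecs)
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]

# Stand-ins for unit-length embeddings; real ones have 384 dimensions.
query = [0.6, 0.8, 0.0]
passages = [
    [0.0, 0.0, 1.0],  # orthogonal to the query
    [0.6, 0.8, 0.0],  # same direction as the query
    [0.8, 0.6, 0.0],  # close to the query
]

ranking = rank_passages(query, passages, top_k=2)
```

With the real model, the query embedding and the rows of the passage embedding array could be passed in the same way. For large corpora a vectorized or indexed search would likely be preferable to this loop.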

## Evaluation

### Metrics

#### Information Retrieval

* Dataset: `dev-evaluation`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.2033     |
| cosine_accuracy@3   | 0.4409     |
| cosine_accuracy@5   | 0.5369     |
| cosine_accuracy@10  | 0.6626     |
| cosine_precision@1  | 0.2033     |
| cosine_precision@3  | 0.147      |
| cosine_precision@5  | 0.1074     |
| cosine_precision@10 | 0.0663     |
| cosine_recall@1     | 0.2033     |
| cosine_recall@3     | 0.4409     |
| cosine_recall@5     | 0.5369     |
| cosine_recall@10    | 0.6626     |
| **cosine_ndcg@10**  | **0.4209** |
| cosine_mrr@10       | 0.3449     |
| cosine_map@100      | 0.3569     |
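
In the table, accuracy@k equals recall@k and precision@k equals accuracy@k divided by k (for example, 0.4409 / 3 ≈ 0.147). That pattern is what one would expect if every query has exactly one relevant passage; this is an inference from the numbers, not something the card states. Under that assumption, the per-query metrics reduce to the sketch below. It is an illustration of the definitions, not the `InformationRetrievalEvaluator` implementation, and the function names and toy document IDs are hypothetical.

```python
import math


def ir_metrics(ranked_ids, relevant_id, k):
    """Metrics at cutoff k for one query with exactly one relevant document."""
    top_k = ranked_ids[:k]
    hit = relevant_id in top_k
    rank = top_k.index(relevant_id) + 1 if hit else None
    return {
        "accuracy": 1.0 if hit else 0.0,
        "precision": 1.0 / k if hit else 0.0,
        "recall": 1.0 if hit else 0.0,
        "mrr": 1.0 / rank if hit else 0.0,
        # With a single relevant document the ideal DCG is 1, so nDCG equals DCG.
        "ndcg": 1.0 / math.log2(rank + 1) if hit else 0.0,
    }


def average_metrics(results, k):
    """Average the per-query metrics over (ranked_ids, relevant_id) pairs."""
    per_query = [ir_metrics(ranked, relevant, k) for ranked, relevant in results]
    return {
        name: sum(m[name] for m in per_query) / len(per_query)
        for name in per_query[0]
    }


# Toy rankings (assumption: made-up IDs, one relevant document per query).
toy_results = [
    (["d1", "d2", "d3"], "d1"),  # relevant document at rank 1
    (["d2", "d1", "d3"], "d1"),  # relevant document at rank 2
    (["d2", "d3", "d4"], "d1"),  # relevant document not retrieved
]

scores = average_metrics(toy_results, k=3)
print(scores)
```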

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 21,376 training samples
* Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
* Approximate statistics based on the first 1000 samples:
  |         | sentence_0                                                                        | sentence_1                                                                          | label                                                         |
  |:--------|:----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:--------------------------------------------------------------|
  | type    | string                                                                            | string                                                                              | float                                                         |
  | details | <ul><li>min: 7 tokens</li><li>mean: 17.45 tokens</li><li>max: 30 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 231.74 tokens</li><li>max: 256 tokens</li></ul> | <ul><li>min: 1.0</li><li>mean: 1.0</li><li>max: 1.0</li></ul> |
* Samples:
  | sentence_0                                                                                                 | sentence_1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | label            |
  |:-----------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|
  | <code>What are the advantages of using flow-volume targets over cash transfers in SCION agreements?</code> | <code>The document discusses optimization strategies for interconnection agreements in SCION, focusing on flow-volume targets and cash compensation mechanisms. The optimization problem involves determining the maximum customer demand for new path segments, denoted as ∆fmax P, and adjusting flow allowances f(a) and additional traffic ∆f(a) accordingly. Flow-volume targets provide predictability and enforceable limits, enhancing the likelihood of positive utility outcomes. Conversely, cash compensation agreements, defined by a payment π between Autonomous Systems (ASes), offer flexibility and can be concluded even when flow-volume targets yield zero solutions. The Nash Bargaining Solution is employed to determine the cash transfer that balances utilities uD(a) and uE(a) of the negotiating parties. Both methods face challenges due to private information regarding costs and pricing, which can lead to inefficiencies in negotiations. The document introduces a mechanism-assisted negotiation approac...</code> | <code>1.0</code> |
  | <code>What is the significance of the variable 'auth⇄a' in packet forwarding?</code>                       | <code>The document chunk presents a formal model for secure packet forwarding protocols within the SCION architecture, detailing the dispatch and handling of packets across internal and external channels. The functions `dispatch-inta`, `dispatch-intc`, `dispatch-exta`, and `dispatch-extc` manage the addition of packets to internal and external send queues based on their authorization status and historical context. The model defines packet structures (`PKTa`) that encapsulate the future path (`fut`), past path (`past`), and historical path (`hist`), with authorization checks against the set of authorized paths (`autha`) and their fragment closures (`auth⇄a`). The environment parameters include the set of compromised nodes (`Nattr`), which influences the security model by introducing potential adversarial actions. The state representation employs asynchronous channels for packet transmission, with the initial state having all channels empty. The model emphasizes the importance of maintaining i...</code> | <code>1.0</code> |
  | <code>How can SNC and COLIBRI SegRs be combined to enhance DRKey bootstrapping security?</code>            | <code>The document chunk discusses the Protected DRKey Bootstrapping mechanism within the SCION architecture, emphasizing the integration of Service Network Controllers (SNC) and COLIBRI Segments (SegRs) to enhance the security of DRKey bootstrapping. This approach mitigates the denial-of-DRKey attack surface by ensuring that pre-shared DRKeys are securely established. The adversary model considers off-path adversaries capable of modifying, dropping, or injecting packets, while maintaining that the reservation path remains uncontested by adversaries. The document highlights the necessity of path stability in the underlying network architecture for effective bandwidth-reservation protocols. Various queuing disciplines, such as distributed weighted fair queuing (DWFQ) and rate-controlled priority queuing (PQ), are discussed for traffic isolation, with specific mechanisms to prevent starvation of lower-priority traffic. Control-plane traffic is rate-limited at infrastructure services and border...</code> | <code>1.0</code> |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```
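
To illustrate what these two parameters do, here is a minimal plain-Python sketch of an in-batch-negatives ranking loss: each anchor is scored against every positive in the batch with cosine similarity, the scores are multiplied by `scale`, and cross-entropy is taken with the anchor's own positive as the target. It is a simplified illustration under those assumptions, not the library's implementation; `cos_sim` and `mnr_loss` here are local helper names, and the vectors are toy values.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the target for anchors[i]
    and every other positive in the batch acts as a negative."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over the batch.
        peak = max(logits)
        log_norm = peak + math.log(sum(math.exp(v - peak) for v in logits))
        losses.append(log_norm - logits[i])
    return sum(losses) / len(losses)


# Toy batch (assumption: 2-dimensional stand-ins for sentence embeddings).
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]   # each positive matches its anchor
swapped = [[0.0, 1.0], [1.0, 0.0]]   # each positive matches the other anchor

loss_aligned = mnr_loss(anchors, aligned)
loss_swapped = mnr_loss(anchors, swapped)
print(loss_aligned, loss_swapped)
```

A larger `scale` sharpens the softmax, so a well-separated batch drives the loss toward zero while a mismatched one is penalized roughly in proportion to `scale`.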

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 128
- `per_device_eval_batch_size`: 128
- `num_train_epochs`: 2
- `fp16`: True
- `multi_dataset_batch_sampler`: round_robin

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 128
- `per_device_eval_batch_size`: 128
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1
- `num_train_epochs`: 2
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: round_robin

</details>

### Training Logs
| Epoch | Step | dev-evaluation_cosine_ndcg@10 |
|:-----:|:----:|:-----------------------------:|
| 1.0   | 84   | 0.4114                        |
| 2.0   | 168  | 0.4209                        |
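
The step counts can be cross-checked against the dataset size and batch size reported above. With 21,376 samples, `per_device_train_batch_size` of 128, and `dataloader_drop_last` set to False, a single device would take 167 steps per epoch, whereas the log shows 84. That figure is consistent with two devices (an effective batch size of 256), but the card does not state the device count, so treat this as an assumption. The helper `steps_per_epoch` below is a hypothetical name used only for this arithmetic.

```python
import math


def steps_per_epoch(num_samples, per_device_batch_size, num_devices=1):
    """Optimizer steps per epoch when the last partial batch is kept."""
    effective_batch_size = per_device_batch_size * num_devices
    return math.ceil(num_samples / effective_batch_size)


single_device = steps_per_epoch(21_376, 128, num_devices=1)
two_devices = steps_per_epoch(21_376, 128, num_devices=2)  # assumed device count

print(single_device, two_devices, two_devices * 2)
```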


### Framework Versions
- Python: 3.12.3
- Sentence Transformers: 3.4.1
- Transformers: 4.49.0
- PyTorch: 2.6.0+cu124
- Accelerate: 1.4.0
- Datasets: 3.3.2
- Tokenizers: 0.21.0

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```
