---
Title: 2 Lexical Structure
Prev: Overview
Next: Primitive Types
---
## 2. Lexical Structure
### 2.1 Comments and Directives
**Single-line Comments**:
```
// Single-line comment
```
**Compiler Directives**:
The language reserves the `#` character for implementation-specific compiler directives. The `#` character cannot appear in identifiers.
```
#!/usr/bin/env stack_lang
```
Directives are implementation-specific and not part of the core language specification. Implementations may support various directives for version requirements, optimization hints, or other configuration.
### 2.2 Identifiers and Constants
**Regular Identifiers**
- Start with a letter or underscore, followed by any number of letters, digits, or underscores: `[a-zA-Z_][a-zA-Z0-9_]*`
- Case-sensitive
- When encountered, identifiers are executed immediately
- Cannot contain `#`, whitespace, `{}`, `[]`, `()`, `.`, `'`, `"`, or `::`
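The identifier rules above can be checked mechanically. The following is a minimal Python sketch, not part of the specification; the helper name `is_regular_identifier` is hypothetical, and the only thing taken from the text is the pattern itself. It does not exclude reserved words such as `true` or `const`, which the rules above do not address.

```python
import re

# Pattern copied from the rule above: a letter or underscore,
# then any number of letters, digits, or underscores.
IDENTIFIER_RE = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")


def is_regular_identifier(token: str) -> bool:
    """Return True if the whole token matches the identifier pattern."""
    return IDENTIFIER_RE.fullmatch(token) is not None


# A few tokens, including ones that use forbidden characters.
for token in ["point", "_tmp1", "Point", "2fast", "a#b", "::name", "a.b"]:
    print(f"{token!r}: {is_regular_identifier(token)}")
```

Because the pattern only admits letters, digits, and underscores, every forbidden character in the last bullet (`#`, whitespace, brackets, `.`, quotes, `::`) is rejected without a separate check.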
**Identifier Literals**
- Prefix with `::` to push the identifier itself onto the stack instead of executing it
- Syntax: `::name` pushes the identifier `name` as a value
- Example: `::Addable` pushes the identifier `Addable` onto the stack
- Example: `::Point` pushes the identifier `Point` onto the stack
**Constants**
Constants are compile-time values that are immutable and available throughout the scope where they are defined.
**Syntax**: `value ::name const`
**Examples**:
```
3.1415926535 ::pi const
9.81 ::gravity const
299792458 ::speed_of_light const
// Use constants like any other value
10 pi * 2 / // (10 * pi) / 2: half the circumference of a circle with diameter 10
```
**Properties**:
- Constants can be evaluated at compile time
- Constants are immutable: once defined, they cannot be reassigned
- Constants defined at module scope are accessible via `Module::CONST_NAME` (when modules are implemented)
- Constants can be any literal value or compile-time expression
> **Advanced Usage**: See [Section 11.2](./advanced_topics.html#112-identifier-literals) for complete identifier literal rules and context-dependent behavior.
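To make the interaction between identifier literals and `const` concrete, here is a toy evaluator written in Python. It is only a sketch under several assumptions that the specification does not state: tokens are separated by whitespace, every number is treated as a float, executing a constant's name pushes its value, and only `*` and `/` are supported. The function name `run` and the `("ident", name)` tuple representation are hypothetical.

```python
def run(source: str) -> list:
    """Evaluate a whitespace-separated token stream and return the final stack."""
    stack = []
    constants = {}
    for line in source.splitlines():
        code = line.split("//", 1)[0]  # drop single-line comments (no string support)
        for token in code.split():
            if token.startswith("::"):
                # Identifier literal: push the name itself instead of executing it.
                stack.append(("ident", token[2:]))
            elif token == "const":
                # `value ::name const` binds the name to the value beneath it.
                target = stack.pop()
                if not (isinstance(target, tuple) and target[0] == "ident"):
                    raise ValueError("const expects an identifier literal on top of the stack")
                name = target[1]
                if name in constants:
                    raise ValueError(f"constant {name!r} cannot be reassigned")
                constants[name] = stack.pop()
            elif token == "*":
                b, a = stack.pop(), stack.pop()
                stack.append(a * b)
            elif token == "/":
                b, a = stack.pop(), stack.pop()
                stack.append(a / b)
            elif token in constants:
                # Assumption: executing a constant's name pushes its value.
                stack.append(constants[token])
            else:
                stack.append(float(token))  # assumption: all numbers are floats here
    return stack


program = """
3.1415926535 ::pi const
10 pi * 2 /   // half the circumference for diameter 10
"""
print(run(program))
```

The sketch rejects a second `const` for the same name, mirroring the immutability property above; it performs all work at run time, so it says nothing about how a real implementation would evaluate constants at compile time.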
### 2.3 Literals
**Integer Literals**
```
42 // i64 (default)
42:i32 // Annotate as i32
0xFF // hexadecimal
0b1010 // binary
0o755 // octal
1_000_000 // underscore separators (ignored)
```
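The literal forms above map closely onto Python's own integer syntax, which allows a short parsing sketch. This is an illustration, not the reference lexer: the function name `parse_int_literal` is hypothetical, and Python's `int(text, 0)` rules are assumed to be close enough to the language's (for instance, Python also accepts uppercase prefixes such as `0XFF` and a leading sign, which the specification does not mention).

```python
def parse_int_literal(text: str) -> tuple[int, str]:
    """Split an integer literal into (value, type name)."""
    body, sep, annotation = text.partition(":")
    # i64 is the documented default when no annotation is present.
    type_name = annotation if sep else "i64"
    # Base 0 makes int() infer the base from a 0x / 0b / 0o prefix
    # and accept underscore separators between digits.
    value = int(body, 0)
    return value, type_name


for literal in ["42", "42:i32", "0xFF", "0b1010", "0o755", "1_000_000"]:
    print(literal, "->", parse_int_literal(literal))
```

The sketch does not check that the value fits in the annotated type; range checking would be a separate step.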
**Overflow Behavior**: Integer arithmetic wraps on overflow. For example, adding 1 to 127, the maximum value of an `i8`, produces -128.
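The wrapping rule can be modeled in Python, whose integers are unbounded, by reducing a result to a fixed-width two's-complement value. This is a minimal sketch assuming two's-complement representation, which the `i8` example above implies; the helper name `wrap` is hypothetical.

```python
def wrap(value: int, bits: int) -> int:
    """Reduce value to a signed two's-complement integer of the given width."""
    mask = (1 << bits) - 1
    value &= mask                  # keep only the low `bits` bits
    if value >= 1 << (bits - 1):   # sign bit set: reinterpret as negative
        value -= 1 << bits
    return value


print(wrap(127 + 1, 8))    # -128: one past the i8 maximum wraps to the minimum
print(wrap(-128 - 1, 8))   # 127: one below the i8 minimum wraps to the maximum
```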
**Floating Point Literals**
```
3.14 // f64 (default)
3.14:f32 // Annotate as f32
1_000.5 // underscore separators allowed
```
**Overflow Behavior**: Floating point operations follow IEEE 754 semantics: a result too large to represent becomes infinity, and an undefined operation (such as infinity minus infinity) produces NaN.
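Python's `float` is an IEEE 754 double, so it can illustrate the `f64` behavior described here. This is only a demonstration of the IEEE 754 rules, not of the language's implementation; note that Python itself raises exceptions for some operations (for example `1.0 / 0.0`) where IEEE 754 would return infinity, so the sketch sticks to multiplication and subtraction.

```python
import math

big = 1e308
overflowed = big * 10                  # exceeds the largest finite f64 value
undefined = overflowed - overflowed    # inf - inf has no defined result

print(overflowed)                  # inf
print(math.isnan(undefined))       # True
print(undefined == undefined)      # False: NaN never compares equal, even to itself
```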
**Character Literals**
```
'A' // Single character
'\n' // Escape sequence
'\x41' // Hexadecimal (A)
'\u{1F600}' // Unicode (😀)
```
**String Literals**
```
"hello world"
"escape sequences: \n \t \\ \""
```
**Escape Sequences**
- `\n` - Newline (LF)
- `\r` - Carriage return (CR)
- `\t` - Tab
- `\\` - Backslash
- `\"` - Double quote
- `\'` - Single quote
- `\0` - Null character
- `\xNN` - Hexadecimal byte (e.g., `\x41` for 'A')
- `\u{NNNN}` - Unicode code point (e.g., `\u{1F600}` for 😀)
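The escape table above can be turned into a small decoder. The following Python sketch is one possible reading, not the reference lexer: the name `decode_escapes` is hypothetical, `\xNN` is assumed to take exactly two hex digits and to denote the code point with that value, and any escape not listed above is treated as an error.

```python
SIMPLE_ESCAPES = {
    "n": "\n", "r": "\r", "t": "\t",
    "\\": "\\", '"': '"', "'": "'", "0": "\0",
}


def decode_escapes(body: str) -> str:
    """Decode escape sequences in a literal's body (surrounding quotes removed)."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("dangling backslash at end of literal")
        kind = body[i + 1]
        if kind in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[kind])
            i += 2
        elif kind == "x":
            digits = body[i + 2:i + 4]
            if len(digits) != 2:
                raise ValueError("\\x needs two hex digits")
            out.append(chr(int(digits, 16)))   # assumption: byte value as code point
            i += 4
        elif kind == "u" and body[i + 2:i + 3] == "{":
            end = body.index("}", i + 3)       # raises ValueError if unterminated
            out.append(chr(int(body[i + 3:end], 16)))
            i = end + 1
        else:
            raise ValueError(f"unknown escape: \\{kind}")
    return "".join(out)


print(decode_escapes(r"\x41"))              # A
print(decode_escapes(r"\u{1F600}"))         # the emoji from the table above
print(repr(decode_escapes(r"tab\tend")))    # 'tab\tend'
```

The same routine would serve both character and string literals, since the table does not distinguish between them.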
**Boolean Literals**
```
true
false
```
**Array Literals**
```
[1 2 3 4 5] // array of i64
[1.0 2.0 3.0] // array of f64
[[1 2] [3 4]] // 2D array
// Typed arrays
[1 2 3 :i16] // array of i16
[[1.0 2.0] [3.0 4.0] :f32] // 2D array of f32
// Arrays of structs (when Point is defined)
[{1 2} {3 4} :Point] // array of Point structs
```
---