Name

SPVM::Document::Language::SyntaxParsing - Syntax Parsing in SPVM Language

Description

This document describes syntax parsing in the SPVM language.

Syntax Parsing

SPVM source code is assumed to be parsed by a parser generated with yacc/bison.

Syntax Parsing Definition

This is the definition of syntax parsing for the SPVM language, written in yacc/bison syntax.

  %token <opval> CLASS HAS METHOD OUR ENUM MY USE AS REQUIRE ALIAS ALLOW CURRENT_CLASS MUTABLE
  %token <opval> ATTRIBUTE MAKE_READ_ONLY INTERFACE EVAL_ERROR_ID ARGS_WIDTH VERSION_DECL
  %token <opval> IF UNLESS ELSIF ELSE FOR WHILE LAST NEXT SWITCH CASE DEFAULT BREAK EVAL
  %token <opval> SYMBOL_NAME VAR_NAME CONSTANT EXCEPTION_VAR
  %token <opval> UNDEF VOID BYTE SHORT INT LONG FLOAT DOUBLE STRING OBJECT TRUE FALSE END_OF_FILE
  %token <opval> FATCAMMA RW RO WO INIT NEW OF BASIC_TYPE_ID EXTENDS SUPER
  %token <opval> RETURN WEAKEN DIE WARN PRINT SAY CURRENT_CLASS_NAME UNWEAKEN '[' '{' '('
  %type <opval> grammar
  %type <opval> opt_classes classes class class_block version_decl
  %type <opval> opt_definitions definitions definition
  %type <opval> enumeration enumeration_block opt_enumeration_values enumeration_values enumeration_value
  %type <opval> method anon_method opt_args args arg use require alias our has has_for_anon_list has_for_anon
  %type <opval> opt_attributes attributes
  %type <opval> opt_statements statements statement if_statement else_statement
  %type <opval> for_statement while_statement foreach_statement
  %type <opval> switch_statement case_statement case_statements opt_case_statements default_statement
  %type <opval> block eval_block init_block switch_block if_require_statement
  %type <opval> unary_operator binary_operator comparison_operator isa isa_error is_type is_error is_compile_type
  %type <opval> call_method
  %type <opval> array_access field_access weaken_field unweaken_field isweak_field convert array_length
  %type <opval> assign inc dec allow can
  %type <opval> new array_init die warn opt_extends
  %type <opval> var_decl var interface union_type
  %type <opval> operator opt_operators operators opt_operator logical_operator void_return_operator
  %type <opval> field_name method_name alias_name is_read_only
  %type <opval> type qualified_type basic_type array_type class_type opt_class_type
  %type <opval> array_type_with_length ref_type  return_type type_comment opt_type_comment
  %right <opval> ASSIGN SPECIAL_ASSIGN
  %left <opval> LOGICAL_OR
  %left <opval> LOGICAL_AND
  %left <opval> BIT_OR BIT_XOR
  %left <opval> BIT_AND
  %nonassoc <opval> NUMEQ NUMNE STREQ STRNE
  %nonassoc <opval> NUMGT NUMGE NUMLT NUMLE STRGT STRGE STRLT STRLE ISA ISA_ERROR IS_TYPE IS_ERROR IS_COMPILE_TYPE NUMERIC_CMP STRING_CMP CAN
  %left <opval> SHIFT
  %left <opval> '+' '-' '.'
  %left <opval> '*' DIVIDE DIVIDE_UNSIGNED_INT DIVIDE_UNSIGNED_LONG MODULO  MODULO_UNSIGNED_INT MODULO_UNSIGNED_LONG
  %right <opval> LOGICAL_NOT BIT_NOT '@' CREATE_REF DEREF PLUS MINUS CONVERT SCALAR STRING_LENGTH ISWEAK REFCNT TYPE_NAME COMPILE_TYPE_NAME DUMP NEW_STRING_LEN IS_READ_ONLY COPY
  %nonassoc <opval> INC DEC
  %left <opval> ARROW

  grammar
    : opt_classes

  opt_classes
    : /* Empty */
    | classes

  classes
    : classes class
    | class

  class
    : CLASS opt_class_type opt_extends class_block END_OF_FILE
    | CLASS opt_class_type opt_extends ':' opt_attributes class_block END_OF_FILE
    | CLASS opt_class_type opt_extends ';' END_OF_FILE
    | CLASS opt_class_type opt_extends ':' opt_attributes ';' END_OF_FILE

  opt_class_type
    : /* Empty */
    | class_type

  opt_extends
    : /* Empty */
    | EXTENDS basic_type

  class_block
    : '{' opt_definitions '}'

  opt_definitions
    : /* Empty */
    | definitions

  definitions
    : definitions definition
    | definition

  definition
    : version_decl
    | use
    | alias
    | allow
    | interface
    | init_block
    | enumeration
    | our
    | has ';'
    | method

  init_block
    : INIT block

  version_decl
    : VERSION_DECL CONSTANT ';'

  use
    : USE basic_type ';'
    | USE basic_type AS alias_name ';'

  require
    : REQUIRE basic_type

  alias
    : ALIAS basic_type AS alias_name ';'

  allow
    : ALLOW basic_type ';'

  interface
    : INTERFACE basic_type ';'

  enumeration
    : opt_attributes ENUM enumeration_block

  enumeration_block
    : '{' opt_enumeration_values '}'

  opt_enumeration_values
    : /* Empty */
    | enumeration_values

  enumeration_values
    : enumeration_values ',' enumeration_value
    | enumeration_values ','
    | enumeration_value

  enumeration_value
    : method_name
    | method_name ASSIGN CONSTANT

  our
    : OUR VAR_NAME ':' opt_attributes qualified_type opt_type_comment ';'

  has
    : HAS field_name ':' opt_attributes qualified_type opt_type_comment

  method
    : opt_attributes METHOD method_name ':' return_type '(' opt_args ')' block
    | opt_attributes METHOD method_name ':' return_type '(' opt_args ')' ';'
    | opt_attributes METHOD ':' return_type '(' opt_args ')' block
    | opt_attributes METHOD ':' return_type '(' opt_args ')' ';'

  anon_method
    : opt_attributes METHOD ':' return_type '(' opt_args ')' block
    | '[' has_for_anon_list ']' opt_attributes METHOD ':' return_type '(' opt_args ')' block

  opt_args
    : /* Empty */
    | args

  args
    : args ',' arg
    | args ','
    | arg

  arg
    : var ':' qualified_type opt_type_comment
    | var ':' qualified_type opt_type_comment ASSIGN operator

  has_for_anon_list
    : has_for_anon_list ',' has_for_anon
    | has_for_anon_list ','
    | has_for_anon

  has_for_anon
    : HAS field_name ':' opt_attributes qualified_type opt_type_comment
    | HAS field_name ':' opt_attributes qualified_type opt_type_comment ASSIGN operator
    | var ':' opt_attributes qualified_type opt_type_comment

  opt_attributes
    : /* Empty */
    | attributes

  attributes
    : attributes ATTRIBUTE
    | ATTRIBUTE

  opt_statements
    : /* Empty */
    | statements

  statements
    : statements statement
    | statement

  statement
    : if_statement
    | for_statement
    | foreach_statement
    | while_statement
    | block
    | switch_statement
    | case_statement
    | default_statement
    | eval_block
    | if_require_statement
    | LAST ';'
    | NEXT ';'
    | BREAK ';'
    | RETURN ';'
    | RETURN operator ';'
    | operator ';'
    | void_return_operator ';'
    | ';'
    | die ';'

  die
    : DIE operator
    | DIE
    | DIE type operator
    | DIE type
    | DIE operator ',' operator

  void_return_operator
    : warn
    | PRINT operator
    | SAY operator
    | weaken_field
    | unweaken_field
    | MAKE_READ_ONLY operator

  warn
    : WARN operator
    | WARN

  for_statement
    : FOR '(' opt_operator ';' operator ';' opt_operator ')' block

  foreach_statement
    : FOR var_decl '(' '@' operator ')' block
    | FOR var_decl '(' '@' '{' operator '}' ')' block

  while_statement
    : WHILE '(' operator ')' block

  switch_statement
    : SWITCH '(' operator ')' switch_block

  switch_block
    : '{' opt_case_statements '}'
    | '{' opt_case_statements default_statement '}'

  opt_case_statements
    : /* Empty */
    | case_statements

  case_statements
    : case_statements case_statement
    | case_statement

  case_statement
    : CASE operator ':' block
    | CASE operator ':'

  default_statement
    : DEFAULT ':' block
    | DEFAULT ':'

  if_require_statement
    : IF '(' require ')' block
    | IF '(' require ')' block ELSE block

  if_statement
    : IF '(' operator ')' block else_statement
    | UNLESS '(' operator ')' block else_statement

  else_statement
    : /* NULL */
    | ELSE block
    | ELSIF '(' operator ')' block else_statement

  block
    : '{' opt_statements '}'

  eval_block
    : EVAL block

  opt_operators
    : /* Empty */
    | operators

  opt_operator
    : /* Empty */
    | operator

  operator
    : var
    | EXCEPTION_VAR
    | CONSTANT
    | UNDEF
    | call_method
    | field_access
    | array_access
    | convert
    | new
    | array_init
    | array_length
    | var_decl
    | unary_operator
    | binary_operator
    | assign
    | inc
    | dec
    | '(' operators ')'
    | CURRENT_CLASS_NAME
    | isweak_field
    | comparison_operator
    | isa
    | isa_error
    | is_type
    | is_error
    | is_compile_type
    | TRUE
    | FALSE
    | is_read_only
    | can
    | logical_operator
    | BASIC_TYPE_ID type
    | EVAL_ERROR_ID
    | ARGS_WIDTH

  operators
    : operators ',' operator
    | operators ','
    | operator

  unary_operator
    : '+' operator %prec PLUS
    | '-' operator %prec MINUS
    | BIT_NOT operator
    | REFCNT operator
    | TYPE_NAME operator
    | COMPILE_TYPE_NAME operator
    | STRING_LENGTH operator
    | DUMP operator
    | DEREF var
    | CREATE_REF operator
    | NEW_STRING_LEN operator
    | COPY operator

  is_read_only
    : IS_READ_ONLY operator

  inc
    : INC operator
    | operator INC

  dec
    : DEC operator
    | operator DEC

  binary_operator
    : operator '+' operator
    | operator '-' operator
    | operator '*' operator
    | operator DIVIDE operator
    | operator DIVIDE_UNSIGNED_INT operator
    | operator DIVIDE_UNSIGNED_LONG operator
    | operator MODULO operator
    | operator MODULO_UNSIGNED_INT operator
    | operator MODULO_UNSIGNED_LONG operator
    | operator BIT_XOR operator
    | operator BIT_AND operator
    | operator BIT_OR operator
    | operator SHIFT operator
    | operator '.' operator

  comparison_operator
    : operator NUMEQ operator
    | operator NUMNE operator
    | operator NUMGT operator
    | operator NUMGE operator
    | operator NUMLT operator
    | operator NUMLE operator
    | operator NUMERIC_CMP operator
    | operator STREQ operator
    | operator STRNE operator
    | operator STRGT operator
    | operator STRGE operator
    | operator STRLT operator
    | operator STRLE operator
    | operator STRING_CMP operator

  isa
    : operator ISA type

  isa_error
    : operator ISA_ERROR type

  is_type
    : operator IS_TYPE type

  is_error
    : operator IS_ERROR type

  is_compile_type
    : operator IS_COMPILE_TYPE type

  logical_operator
    : operator LOGICAL_OR operator
    | operator LOGICAL_AND operator
    | LOGICAL_NOT operator

  assign
    : operator ASSIGN operator
    | operator SPECIAL_ASSIGN operator

  new
    : NEW basic_type
    | NEW array_type_with_length
    | anon_method

  array_init
    : '[' opt_operators ']'
    | '{' operators '}'
    | '{' '}'

  convert
    : '(' qualified_type ')' operator %prec CONVERT
    | operator ARROW '(' qualified_type ')' %prec CONVERT

  call_method
    : CURRENT_CLASS SYMBOL_NAME '(' opt_operators  ')'
    | CURRENT_CLASS SYMBOL_NAME
    | basic_type ARROW method_name '(' opt_operators  ')'
    | basic_type ARROW method_name
    | operator ARROW method_name '(' opt_operators ')'
    | operator ARROW method_name
    | operator ARROW '(' opt_operators ')'

  array_access
    : operator ARROW '[' operator ']'
    | array_access '[' operator ']'
    | field_access '[' operator ']'

  field_access
    : operator ARROW '{' field_name '}'
    | field_access '{' field_name '}'
    | array_access '{' field_name '}'

  weaken_field
    : WEAKEN var ARROW '{' field_name '}'

  unweaken_field
    : UNWEAKEN var ARROW '{' field_name '}'

  isweak_field
    : ISWEAK var ARROW '{' field_name '}'

  can
    : operator CAN method_name
    | operator CAN CONSTANT

  array_length
    : '@' operator
    | '@' '{' operator '}'
    | SCALAR '@' operator
    | SCALAR '@' '{' operator '}'

  var_decl
    : MY var ':' qualified_type opt_type_comment
    | MY var

  var
    : VAR_NAME

  qualified_type
    : type
    | MUTABLE type

  type
    : basic_type
    | array_type
    | ref_type

  class_type
    : basic_type

  basic_type
    : SYMBOL_NAME
    | BYTE
    | SHORT
    | INT
    | LONG
    | FLOAT
    | DOUBLE
    | OBJECT
    | STRING

  ref_type
    : basic_type '*'

  array_type
    : basic_type '[' ']'
    | array_type '[' ']'

  array_type_with_length
    : basic_type '[' operator ']'
    | array_type '[' operator ']'

  return_type
    : qualified_type opt_type_comment
    | VOID

  opt_type_comment
    : /* Empty */
    | type_comment

  type_comment
    : OF union_type

  union_type
    : union_type BIT_OR type
    | type

  field_name
    : SYMBOL_NAME

  method_name
    : SYMBOL_NAME

  alias_name
    : SYMBOL_NAME
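
Several list rules above (enumeration_values, args, has_for_anon_list, operators) share one shape: a left-recursive list whose items are separated by commas, plus an alternative that ends in a comma. The following Python sketch is a hand-written recognizer for that shape only. It is not code from SPVM; the function name matches_list_rule and the is_item callback are assumptions made for illustration, and the input is assumed to be already split into tokens.

```python
def matches_list_rule(tokens, is_item):
    """Check tokens against the shape:  list : list ',' item | list ',' | item.

    This mirrors the grammar rule only; it says nothing about checks
    that a compiler may perform after parsing.
    """
    # The rule has no empty alternative (the opt_* wrapper rules handle
    # that), so the list must start with an item.
    if not tokens or not is_item(tokens[0]):
        return False
    pos = 1
    while pos < len(tokens):
        # Every continuation starts with a comma ...
        if tokens[pos] != ",":
            return False
        pos += 1
        # ... optionally followed by one more item.
        if pos < len(tokens) and is_item(tokens[pos]):
            pos += 1
    return True


def is_name(token):
    # Hypothetical item test: anything that is not a comma.
    return token != ","


print(matches_list_rule(["A", ",", "B"], is_name))
print(matches_list_rule(["A", ",", "B", ","], is_name))
print(matches_list_rule(["A", "B"], is_name))
```

Read literally, the rule accepts a trailing comma and also consecutive commas. Whether later compilation phases reject such input is not covered by this document.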

Syntax Parsing Token

The list of syntax parsing tokens:

  Tokens                Keywords or operators
  ALIAS                 alias
  ALLOW                 allow
  ARROW                 ->
  AS                    as
  ASSIGN                =
  BIT_AND               &
  BASIC_TYPE_ID         basic_type_id
  BIT_NOT               ~
  BIT_OR                |
  BIT_XOR               ^
  BREAK                 break
  BYTE                  byte
  CASE                  case
  CLASS                 class
  VAR_NAME              A variable name
  COMPILE_TYPE_NAME     compile_type_name
  CONSTANT              Literal
  CONVERT               (TypeName)
  COPY                  copy
  CURRENT_CLASS         &
  CURRENT_CLASS_NAME    __PACKAGE__
  DEC                   --
  DEFAULT               default
  DEREF                 $
  ATTRIBUTE             The name of an attribute
  DIE                   die
  DIVIDE                /
  DIVIDE_UNSIGNED_INT   div_uint
  DIVIDE_UNSIGNED_LONG  div_ulong
  DOUBLE                double
  DUMP                  dump
  ELSE                  else
  ELSIF                 elsif
  END_OF_FILE           The end of the file
  ENUM                  enum
  EVAL_ERROR_ID         eval_error_id
  EXTENDS               extends
  EVAL                  eval
  EXCEPTION_VAR         $@
  FATCAMMA              =>
  FLOAT                 float
  FOR                   for
  HAS                   has
  CAN                   can
  IF                    if
  INTERFACE             interface
  INC                   ++
  INIT                  INIT
  INT                   int
  ISA                   isa
  ISWEAK                isweak
  IS_TYPE               is_type
  IS_READ_ONLY          is_read_only
  LAST                  last
  LENGTH                length
  LOGICAL_AND           &&
  LOGICAL_NOT           !
  LOGICAL_OR            ||
  LONG                  long
  MAKE_READ_ONLY        make_read_only
  METHOD                method
  MINUS                 -
  MUTABLE               mutable
  MY                    my
  SYMBOL_NAME           A symbol name
  NEW                   new
  NEW_STRING_LEN        new_string_len
  OF                    of
  NEXT                  next
  NUMEQ                 ==
  NUMERIC_CMP           <=>
  NUMGE                 >=
  NUMGT                 >
  NUMLE                 <=
  NUMLT                 <
  NUMNE                 !=
  OBJECT                object
  OUR                   our
  PLUS                  +
  PRINT                 print
  REF                   \
  TYPE_NAME             type_name
  MODULO                %
  MODULO_UNSIGNED_INT   mod_uint
  MODULO_UNSIGNED_LONG  mod_ulong
  REQUIRE               require
  RETURN                return
  RO                    ro
  RW                    rw
  SAY                   say
  SCALAR                scalar
  SELF                  self
  SHIFT                 << >> >>>
  SHORT                 short
  SPECIAL_ASSIGN        += -= *= /= &= |= ^= %= <<= >>= >>>= .=
  STRING_CMP            cmp
  STREQ                 eq
  STRGE                 ge
  STRGT                 gt
  STRING                string
  STRLE                 le
  STRLT                 lt
  STRNE                 ne
  SWITCH                switch
  UNDEF                 undef
  UNLESS                unless
  UNWEAKEN              unweaken
  USE                   use
  VAR                   var
  VERSION               version
  VOID                  void
  WARN                  warn
  WEAKEN                weaken
  WHILE                 while
  WO                    wo
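
The token list above can be read as a lookup from source text to token name. The following Python sketch shows one way such a lookup might be applied, using a small subset of the rows and a longest-match rule so that >>>= is not split into >>> and =. It is an illustration only and is not SPVM's tokenizer: the names KEYWORDS, OPERATORS, scan_word and tokenize are assumptions, variable names are simplified to $ followed by a word, and context-dependent entries such as & (BIT_AND or CURRENT_CLASS) are left out.

```python
# A few rows copied from the token list; an illustrative subset only.
KEYWORDS = {"if": "IF", "elsif": "ELSIF", "my": "MY", "return": "RETURN", "isa": "ISA"}
OPERATORS = {
    ">>>=": "SPECIAL_ASSIGN", ">>=": "SPECIAL_ASSIGN", "<<=": "SPECIAL_ASSIGN",
    ">>>": "SHIFT", ">>": "SHIFT", "<<": "SHIFT",
    "<=>": "NUMERIC_CMP", ">=": "NUMGE", "<=": "NUMLE", ">": "NUMGT", "<": "NUMLT",
    "==": "NUMEQ", "!=": "NUMNE", "=>": "FATCAMMA", "=": "ASSIGN",
    "->": "ARROW", "++": "INC", "--": "DEC",
}
# Try longer spellings first so that a short operator never wins over a
# longer one that starts with the same characters.
OPERATORS_LONGEST_FIRST = sorted(OPERATORS, key=len, reverse=True)


def scan_word(text, start):
    """Return the index just past the run of word characters at start."""
    end = start
    while end < len(text) and (text[end].isalnum() or text[end] == "_"):
        end += 1
    return end


def tokenize(text):
    """Split text into (token_name, source_text) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        ch = text[pos]
        if ch.isspace():
            pos += 1
        elif ch == "$" and pos + 1 < len(text) and (text[pos + 1].isalpha() or text[pos + 1] == "_"):
            end = scan_word(text, pos + 1)
            tokens.append(("VAR_NAME", text[pos:end]))
            pos = end
        elif ch.isalpha() or ch == "_":
            end = scan_word(text, pos)
            word = text[pos:end]
            # Words that are not keywords are treated as symbol names.
            tokens.append((KEYWORDS.get(word, "SYMBOL_NAME"), word))
            pos = end
        elif ch.isdigit():
            end = pos
            while end < len(text) and text[end].isdigit():
                end += 1
            tokens.append(("CONSTANT", text[pos:end]))
            pos = end
        else:
            for op in OPERATORS_LONGEST_FIRST:
                if text.startswith(op, pos):
                    tokens.append((OPERATORS[op], op))
                    pos += len(op)
                    break
            else:
                raise ValueError(f"no token in this subset matches at offset {pos}")
    return tokens


print(tokenize("$x >>>= 2"))
```

With this subset, the operator in "$x >>>= 2" is reported as a single SPECIAL_ASSIGN token, and the operator in "$a >= $b" as NUMGE.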

Unary Operator

A unary operator is an operator that takes one operand.

  UNARY_OPERATOR OPERAND

Binary Operator

A binary operator is an operator that takes a left operand (LEFT_OPERAND) and a right operand (RIGHT_OPERAND).

  LEFT_OPERAND BINARY_OPERATOR RIGHT_OPERAND
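
The two shapes can be pictured as tree nodes with one or two children. The following Python sketch models them with small data classes; the class names Var, Unary and Binary and the render function are assumptions made for illustration and do not come from SPVM.

```python
from dataclasses import dataclass


@dataclass
class Var:
    name: str


@dataclass
class Unary:
    op: str          # UNARY_OPERATOR
    operand: object  # OPERAND


@dataclass
class Binary:
    op: str        # BINARY_OPERATOR
    left: object   # LEFT_OPERAND
    right: object  # RIGHT_OPERAND


def render(node):
    """Write a node back out in the operator/operand order shown above."""
    if isinstance(node, Var):
        return node.name
    if isinstance(node, Unary):
        return f"{node.op}{render(node.operand)}"
    if isinstance(node, Binary):
        return f"{render(node.left)} {node.op} {render(node.right)}"
    raise TypeError(f"unknown node: {node!r}")


# A unary minus used as the left operand of a binary plus.
expr = Binary("+", Unary("-", Var("$x")), Var("$y"))
print(render(expr))
```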

Operator Precedence

The definition of operator precedence. This is written in yacc/bison syntax.

The bottom line has the highest precedence and the top line has the lowest precedence.

  %right <opval> ASSIGN SPECIAL_ASSIGN
  %left <opval> LOGICAL_OR
  %left <opval> LOGICAL_AND
  %left <opval> BIT_OR BIT_XOR
  %left <opval> BIT_AND
  %nonassoc <opval> NUMEQ NUMNE STREQ STRNE
  %nonassoc <opval> NUMGT NUMGE NUMLT NUMLE STRGT STRGE STRLT STRLE ISA ISA_ERROR IS_TYPE IS_ERROR IS_COMPILE_TYPE NUMERIC_CMP STRING_CMP CAN
  %left <opval> SHIFT
  %left <opval> '+' '-' '.'
  %left <opval> '*' DIVIDE DIVIDE_UNSIGNED_INT DIVIDE_UNSIGNED_LONG MODULO  MODULO_UNSIGNED_INT MODULO_UNSIGNED_LONG
  %right <opval> LOGICAL_NOT BIT_NOT '@' CREATE_REF DEREF PLUS MINUS CONVERT SCALAR STRING_LENGTH ISWEAK REFCNT TYPE_NAME COMPILE_TYPE_NAME DUMP NEW_STRING_LEN IS_READ_ONLY COPY
  %nonassoc <opval> INC DEC
  %left <opval> ARROW

See also the syntax parsing token list for the actual operators that correspond to these tokens.

The precedence of an operation can be raised by enclosing it in ().

  # a * b is calculated first
  a * b + c
  
  # b + c is calculated first
  a * (b + c)
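
The effect of the precedence levels and of %left and %right can be sketched with a small precedence-climbing parser. The following Python code is not how yacc/bison works internally (bison generates an LALR table parser); it is a hand-written approximation that uses the row positions of a few operators from the definition above as binding levels. The names PRECEDENCE, tokenize, parse, parse_primary and group are assumptions made for illustration.

```python
import re

# Row number in the precedence definition (1 = top = lowest precedence)
# and associativity, for a small subset of the operators.
PRECEDENCE = {
    "=": (1, "right"),
    "||": (2, "left"),
    "&&": (3, "left"),
    "+": (9, "left"),
    "-": (9, "left"),
    "*": (10, "left"),
    "/": (10, "left"),
}

TOKEN_RE = re.compile(r"\s*(\|\||&&|[A-Za-z_]\w*|\d+|[-+*/=()])")


def tokenize(text):
    tokens, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_primary(tokens):
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        inner = parse(tokens, 1)  # parentheses restart at the lowest level
        if not tokens or tokens.pop(0) != ")":
            raise SyntaxError("expected ')'")
        return inner
    if token in PRECEDENCE or token == ")":
        raise SyntaxError(f"unexpected token {token!r}")
    return token


def parse(tokens, min_level=1):
    left = parse_primary(tokens)
    while tokens and tokens[0] in PRECEDENCE and PRECEDENCE[tokens[0]][0] >= min_level:
        op = tokens.pop(0)
        level, assoc = PRECEDENCE[op]
        # Left-associative operators must not absorb another operator of
        # the same level on their right side; right-associative ones may.
        right = parse(tokens, level + 1 if assoc == "left" else level)
        left = f"({left} {op} {right})"
    return left


def group(text):
    """Return text with the grouping made explicit by parentheses."""
    tokens = tokenize(text)
    result = parse(tokens)
    if tokens:
        raise SyntaxError(f"unexpected token {tokens[0]!r}")
    return result


print(group("a * b + c"))    # ((a * b) + c)
print(group("a * (b + c)"))  # (a * (b + c))
print(group("a = b = c"))    # (a = (b = c))
```

The first two lines of output correspond to the two examples above. The third shows the effect of %right on ASSIGN: the rightmost assignment is grouped first.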

Copyright & License

Copyright (c) 2023 Yuki Kimoto

MIT License