6502.org Forum  Projects  Code  Documents  Tools  Forum
It is currently Fri Nov 22, 2024 8:11 pm

All times are UTC




Post new topic Reply to topic  [ 103 posts ]  Go to page Previous  1, 2, 3, 4, 5 ... 7  Next
Author Message
PostPosted: Wed Oct 02, 2024 5:48 am 
Offline
User avatar

Joined: Tue Feb 28, 2023 11:39 pm
Posts: 257
Location: Texas
barnacle wrote:
Thanks Yuri, you nailed it with Thought 2... a missing definition for value, which should be local. Not at all sure where it was picking up a global, though, but there is another 'value' in Expression(), which is recursively called.


Sounds like a case of a shadowed variable. I was under the impression compilers would warn about that by default, but apparently not as I checked GCC's behavior on this.

In anyevent:
cc -Wshadow

Will warn about that situation.

Usually when writing C/C++ I turn the maximum amount of warnings on as I possibly can, for GCC/Clang that usually looks something akin to:
cc -Wall -Wpedantic -Werror -Wshadow <all other options>

Your millage may vary depending on the compiler in use.

barnacle wrote:
(Yes, I do use lots of strictly not-necessary braces; I tend to use the MISRA model (and have to, for work)).


Megh, the addition/removal of braces aren't usually a problem for me.


Top
 Profile  
Reply with quote  
PostPosted: Wed Oct 02, 2024 6:54 am 
Offline

Joined: Mon Jan 19, 2004 12:49 pm
Posts: 983
Location: Potsdam, DE
Hmm. I need to check the scope visibility rules.

Expression() has a variable 'value' which is set by either the return value of Term(), or has the return value of Term() added to or subtracted from it.

Term() has another variable, also now called 'value', which is set similarly by the return value of Factor(), or is multiplied or divided thereby.

As you can tell, I was surprised that Term() could see a variable in a routine which called it which is not explicitly global; I know this is expected if the variable is e.g. referenced in an if/then clause or a loop, but I didn't expect that to carry through to a subroutine. After all, a new stack frame has been built at that point.

The mistake I made was a failure to redefine the variable in the scope; the surprise was the compiler wandering off and find one to use.

Neil


Top
 Profile  
Reply with quote  
PostPosted: Wed Oct 02, 2024 7:18 am 
Offline
User avatar

Joined: Tue Feb 28, 2023 11:39 pm
Posts: 257
Location: Texas
FWIW this is the version I came up with getting past the section 2 of Cernshaw's tutorial:
Code:
#include <stdio.h>
#include <stdarg.h>
#include <stdint.h>
#include <stdlib.h>
#include <ctype.h>

// Lower this for smaller systems
#define MAX_BUFFER_SIZE 1024

/**************************/
/**** INPUT FUNCTIONS  ****/
/**************************/

char g_look;

void getChar()
{
    g_look = fgetc(stdin);
}

/**************************/
/**** OUTPUT FUNCTIONS ****/
/**************************/

void verrorf(const char *err, va_list args)
{
    char buf[MAX_BUFFER_SIZE];

    vsnprintf(buf, sizeof(buf), err, args);
    fputs(buf, stderr);
    fputs("\r\n", stderr);
}

void errorf(const char *err, ...)
{
    va_list args;

    va_start(args, err);
    verrorf(err, args);
    va_end(args);
}

void vabortf(const char *err, va_list args)
{
    char buf[MAX_BUFFER_SIZE];

    vsnprintf(buf, sizeof(buf), err, args);
    fputs(buf, stderr);
    fputs("\r\n", stderr);
    exit(-1);
}

void abortf(const char *err, ...)
{
    va_list args;

    va_start(args, err);
    vabortf(err, args);
    va_end(args); // Doesn't actually get here
}

void vemitf(const char *str, va_list args)
{
    char buf[MAX_BUFFER_SIZE];

    vsnprintf(buf, sizeof(buf), str, args);
    printf("    %s\r\n", buf);
}

void emitf(const char *str, ...)
{
    va_list args;
   
    va_start(args, str);
    vemitf(str, args);
    va_end(args);
}

/**************************/
/**** LEXING           ****/
/**************************/

void match(char x)
{
    if (g_look != x)
    {
        abortf("'%c' expected", x);
    }

    getChar();
}

char getName()
{
    char rval;

    if (!isalpha(g_look))
    {
        abortf("Name expected");
    }

    rval = (char)toupper(g_look);
    getChar();
    return rval;
}

int16_t getNum()
{
    int16_t rval;

    if (!isdigit(g_look))
    {
        abortf("Digit expected");
    }

    rval = (int16_t)(g_look - '0');
    getChar();
    return rval;
}

/**************************/
/**** PARSING          ****/
/**************************/

void parseFactor();
void parseTerm();
void parseExpression();

void parseFactor()
{
    if (g_look == '(')
    {
        match('(');
        parseExpression();
        match(')');
    }
    else
    {
        emitf("MOVE #%d, D0", getNum());
    }
}

void parseTerm()
{
    parseFactor();

    while (g_look == '*' || g_look == '/')
    {
        emitf("MOVE D0,-(SP)");

        switch (g_look)
        {
        case '*':
            match('*');
            parseFactor();
            emitf("MULS (SP)+, D0");
            break;

        case '/':
            match('/');
            parseFactor();
            emitf("MOVE (SP)+, D1");
            emitf("DIVS D1, D0");
            break;

        default:
            abortf("Unexpected token: '%c'", g_look);
            break;
        }
    }
}

void parseExpression()
{
    if (g_look == '+' || g_look == '-') // Handles unary + and -
        emitf("CLR D0");
    else
        parseTerm();

    while (g_look == '+' || g_look == '-')
    {
        emitf("MOVE D0,-(SP)");

        switch (g_look)
        {
        case '+':
            match('+');
            parseTerm();
            emitf("ADD (SP)+, D0");
            break;

        case '-':
            match('-');
            parseTerm();
            emitf("SUB (SP)+, D0");
            emitf("NEG D0");
            break;

        default:
            abortf("Unexpected token: '%c'", g_look);
            break;
        }
    }
}

/**************************/
/**** MAIN             ****/
/**************************/

void init()
{
    getChar();
}

int main(void)
{
    init();
    parseExpression();
    return 0;
}


Build command:
Code:
/usr/bin/cc   -std=c99 -Wall -Wextra -Wpedantic -Werror -o main.o -c main.c
cc main.o -o crenshaw


Top
 Profile  
Reply with quote  
PostPosted: Wed Oct 02, 2024 7:29 am 
Offline
User avatar

Joined: Tue Feb 28, 2023 11:39 pm
Posts: 257
Location: Texas
I should note that he points out that the handling of the unary + and - operators is not the best in his implementation. I tend to agree, though I don't think it needs to be nearly as complex of a problem to solve as he suggests later on in his optimization section. You can handle unary + and - if you consider it like another nested set of rules for your recursive decent parser, this is how I handle it in my implementation for the Orange Soda bootstrap compiler.

Modifying his example this is the way I would handle it:
Code:

void parseUnary()
{
    int negate = 0;

    if (g_look == '+')
    {
        match('+'); // Eat +
    }
    else if (g_look == '-')
    {
        match('-');
        negate = 1;
    }

    parseFactor();

    if (negate)
    {
        emitf("NEG D0");
    }
}

void parseTerm()
{
    parseUnary();

    while (g_look == '*' || g_look == '/')
    {
        emitf("MOVE D0,-(SP)");

        switch (g_look)
        {
        case '*':
            match('*');
            parseUnary();
            emitf("MULS (SP)+, D0");
            break;

        case '/':
            match('/');
            parseUnary();
            emitf("MOVE (SP)+, D1");
            emitf("DIVS D1, D0");
            break;

        default:
            abortf("Unexpected token: '%c'", g_look);
            break;
        }
    }
}

void parseExpression()
{
    parseTerm();

    while (g_look == '+' || g_look == '-')
    {
        emitf("MOVE D0,-(SP)");

        switch (g_look)
        {
        case '+':
            match('+');
            parseTerm();
            emitf("ADD (SP)+, D0");
            break;

        case '-':
            match('-');
            parseTerm();
            emitf("SUB (SP)+, D0");
            emitf("NEG D0");
            break;

        default:
            abortf("Unexpected token: '%c'", g_look);
            break;
        }
    }
}


Of note, the parseUnary() is where I would also handle things like the C style operators such as '!' and '~'.


Top
 Profile  
Reply with quote  
PostPosted: Wed Oct 02, 2024 8:49 am 
Offline

Joined: Mon Jan 19, 2004 12:49 pm
Posts: 983
Location: Potsdam, DE
I have an alternate GetNum() that applies that unary operation internally and multiplies by -1 (because inverse and add one is easy rather than working with stuff on the stack) which I will probably integrate later.

A quick look at Expression() gives a gratifyingly simple 65c02 (untested! Not even test assembled!) version. It uses a model with Y:A as an input and output for 16-bit parameters and results and local variable stored on the stack. A called routine can also read a variable on a its parent's stack, naughty, but easy, by setting X to the stack pointer: lda 0x101,x gives the low byte of the return address, lda 0x102,x gives the parents most recent variable. The caller has to know what to do, of course, but it's simple and consistent.

Code:
expression:
   lda Look         
   jsr isAddOp         ; if isAddOp(Look)
   beq   exp_1
      ldx #0          ; no, push zero as 'value'
      phx
      phx
   bra exp_2
exp_1:
      jsr Term()      ; result is in Y:A
      phy
      pha             ; so save that as 'value'
exp_2:
   lda Look   
   jsr isAddOp         ; non-zero if + or -
   beq exp_ret           
      lda Look        ; switch (Look)
      cmp #'+'
      bne exp_sw_2
         jsr Matc    ; acc is already '+'
         jsr Term    ; returns value in Y:A
         jsr pluseq  ; adds Y:A to stack data above return address
                        ; (which is 'value', to Y:A, and saves the result
                        ; to the same place, updating 'value'
      bra exp_ret
exp_sw_2:
         jsr Match   ; acc must be '-'
         jsr Term
         jsr minuseq ; subtract term from value
exp_ret:
   pla
   ply                 ; pull 'value' from stack and into Y:A
   ret   

So pluseq and minuseq behave like this.

Neil

edit: improved version of expression, though still untested. Annoying that the code tags do strange things to formatting.
edit2: corrected stack offset


Last edited by barnacle on Thu Oct 03, 2024 7:29 pm, edited 1 time in total.

Top
 Profile  
Reply with quote  
PostPosted: Thu Oct 03, 2024 6:26 am 
Offline

Joined: Mon Jan 19, 2004 12:49 pm
Posts: 983
Location: Potsdam, DE
And a possible way of doing pluseq:
Code:
; void pluseq (int16_t)
; add Y:A to the value on the stack immediately above the return address
; i.e. the last parameter placed by the calling routine
pluseq:
   tsx
   clc
   adc 0x103,x
   sta 0x103,x
   tya
   adc 0x104,x
   sta 0x104,x
   ret

minuseq will be similar; muleq and diveq will be a bit more complex but work the same way.

Neil


Top
 Profile  
Reply with quote  
PostPosted: Thu Oct 03, 2024 7:35 am 
Offline

Joined: Sun May 07, 2017 3:59 pm
Posts: 21
barnacle wrote:
Annoying that the code tags do strange things to formatting.

Are you talking about the indentation? At first I thought that was intentional to make it easier to follow the branching. After reading this comment, my guess is you have mixed spaces and tabs in your code :)


Top
 Profile  
Reply with quote  
PostPosted: Thu Oct 03, 2024 8:06 am 
Offline

Joined: Mon Jan 19, 2004 12:49 pm
Posts: 983
Location: Potsdam, DE
Originally, I used all tabs as is my habit. Somewhere along the line they got translated to spaces and no longer lined up. So I tried again with all spaces... and it still ended up misaligned. As you say, probably one or two tabs left in.

The indentation is intentional to show branching; but when the comments left me they were all lined up nicely :mrgreen:

Neil


Top
 Profile  
Reply with quote  
PostPosted: Fri Oct 04, 2024 6:47 pm 
Offline

Joined: Mon Jan 19, 2004 12:49 pm
Posts: 983
Location: Potsdam, DE
I think this is now somewhere useful. Pointed at text in memory, it will correctly evaluate an infix expression, properly ignoring leading and trailing spaces. I've included the usual arithmetic operators and also, given its eventual use, bitwise logical operators. Precedence is logic, then multiplication/division, then finally addition/subtraction. This can be overridden using brackets as required - though this requires all routines to hold their own data on the stack because it's recursive. There are no unary operators at the moment though I may add them later, and the only radix is decimal.

Suitable for a tiny basic, it also accepts 26 variables A-Z (not case sensitive). I preload them with values from 0-25 for testing but in the real world that would be unnecessary of course. There is as yet no assignment operator; that will be further in the process.

When an error is found, an error variable is set and further evaluation halted; the returned value is undefined. This differs from Dr Crenshaw's original as he just drops straight down to the OS; I need to complete the operation to unwind the stack and then stop.

This is all very simple C to make it easy to convert to assembly - see earlier posts in this thread for examples. I've not included all the routines for brevity but they should be obvious.

Look is a global char which holds the last character read by GetChar(); mem_ptr is a global pointer to char.

SkipSpaces does what you expect but because it uses GetChar, it also loads Look with the next character to process.

This is called before everything else; it sets up the start of the expression to be evaluated, and (temporarily) preloads the variables.
Code:
void Init (char * mem)
{
   mem_ptr = mem;
   err = ERR_NONE;
   for (int q = 0; q < 26; q++)
   {
      vars[q] = q;
   }
   SkipWhite();
   GetChar();
   SkipWhite();
}

Expression reads either a term, or adds/subtracts further terms as long as there are any - evaluation is left to right.
Code:
int16_t Expression (void)
{
   int16_t value;

   if (IsAddOp(Look))
   {
      value = 0;
   }
   else
   {
      value = Term();
   }
   if (ERR_NONE == err)
   {
      while (IsAddOp(Look))
      {
         // there are only two possibilities
         if ('+' == Look)
         {
            Match ('+');
            value += Term();
         }
         else
         {
            Match ('-');
            value -= Term();
         }
      }
   }
   return value;
}

Term() is similar, but rather than looking for * or / it's expecting logic operators from Alu();
Code:
int16_t Term (void)
{
   int16_t term = Alu();
   
   if (ERR_NONE == err)
   {
      while (IsMulOp(Look))
      {
         // there are only two possibilities
         if ('*' == Look)
         {
            Match ('*');
            term *= Alu();
         }
         else
         {
            Match ('/');
            term /= Alu();
         }
      }
   }
   return term;
}

It's all looking very similar; if we don't find anything interesting, we look for factors...
Code:
int16_t Alu (void)
{
   // bitwise logical operations

   int16_t alu = Factor();
   
   if (ERR_NONE == err)
   {
      while (IsLogOp(Look))
      {
         if ('&' == Look)
         {
            Match ('&');
            alu &= Factor();
         }
         else if ('|' == Look)
         {
            Match ('|');
            alu |= Factor();
         }
         else
         {
            Match ('^');
            alu ^= Factor();
         }
      }
   }
   return alu;
}

Finally we get to Factor(), where we both handle brackets by recursive calls to Expression(), and actual numbers of variables.
Code:
int16_t Factor (void)
{
   int16_t factor;
   
   if (Look == '(')
   {
      Match ('(');
      factor = Expression ();
      Match (')');
   }
   else if (isalpha(Look))
   {
      factor = vars[GetName() - 'A'];
   }
   else
   {
      factor = GetNum();
   }
   return factor;
}


Match() is interesting. It checks that it is already looking at the desired character and if it is, advances the pointer and reloads Look with the next character. This can throw an error if the character doesn't match.
Code:
void Match (char x)
{
   // match a specific input character, get the next character
   // and skip over any following white space
   if (x == Look)
   {
      GetChar();
      SkipWhite();
   }
   else
   {
      char buf[4];         // Expected needs a string
      sprintf(buf, "%c", x);
      Expected(buf);
   }
}

GetNum() and GetName() return either a multi-numeral integer, or a single letter.
Code:
char GetName (void)
{
   // validate and return a single character variable name
   // get an identifier
   char ret = ' ';
   if (!isalpha(Look))
   {
      Expected("Name");
   }
   ret = toupper(Look);
   GetChar();
   SkipWhite();
   return ret;
}

int16_t GetNum (void)
{
   // get a number
   int16_t ret = 0;
   if (!isdigit(Look))
   {
      Expected("Integer");
   }
   while (isdigit(Look))
   {
      ret *= 10;
      ret += Look - '0';
      GetChar();
   }
   SkipWhite();
   return ret;
}


Neil


Top
 Profile  
Reply with quote  
PostPosted: Fri Oct 04, 2024 8:50 pm 
Offline
User avatar

Joined: Tue Feb 28, 2023 11:39 pm
Posts: 257
Location: Texas
Looks like you're lumping a lot of logical operators together. I don't know how various BASIC's handle it, but with C there is an order to the rules:

From most precedent to least in C:
primary: variables, constants, function calls, stuff in parens
unary: ++, --, sizeof, !, ~, etc.
cast: (as it says on the tin)
multiply: *, /, %
add: +, -
shift: <<, >>
relational: <, >, <=, >=
equality: ==, !=
and: &
xor: ^
or: |
logical_and: &&
logical_or: ||

If I recall correctly, many BASICs use ^ as an exponent operator, so perhaps like so:
primary
unary, +, -, NOT
exponent: ^
multiply: *, /, MOD
add: +, -
relational: <, >, <=, >=
equality: =, <>
logical_and: AND
logical_or: OR


Top
 Profile  
Reply with quote  
PostPosted: Sat Oct 05, 2024 6:53 am 
Offline

Joined: Mon Jan 19, 2004 12:49 pm
Posts: 983
Location: Potsdam, DE
Yes, I'm sadly aware of C's operator precedence, but I choose to disregard it :mrgreen: - half of it doesn't apply anyway, and no-one who is not trying to obfuscate C code (or being very clever) uses the logic operators in non-obvious ways without also using brackets.

Remember that this is for a Tiny Basic with only integer arithmetic and no mathematics, so exponent isn't a factor (or equally so if you used it for e.g. an assembler) and I think the relational/comparison/assignment operators will find their way in later in development, as they're needed. Dr Crenshaw's method is impressively simple and powerful once you get your head around it.

I have chosen the logic operators as highest priority, but one can move the precedence around very easily (e.g. Expression would look for logic operators and expect a new 'Addition' routine that expected Add operators, which would itself expect Mult operators, and Mult would look for factors (Alu() would disappear, but the precedence would then have logic below addition below multiplication.)

Neil

p.s. a second thought about pluseq and its friends: to be more generic in other uses, rather than writing back to the caller's stack, it should just return the result in Y:A and let the caller write it. It still needs access to read the same value, but rather than it snooping on the stack that would be a more obvious 'push parameter B, put parameter A in Y:A, call the routine, and save the result wherever'. In this case, it's an optimisation to use the local variable as the B parameter without explicitly pushing it.


Top
 Profile  
Reply with quote  
PostPosted: Tue Oct 08, 2024 8:22 pm 
Offline

Joined: Mon Jan 19, 2004 12:49 pm
Posts: 983
Location: Potsdam, DE
It's getting further... I can write code lines, insert out of order, and replace/delete existing lines. I can list. I can print in various combinations.
Code:
Tiny Basic 65c02
> 10 for q = 1 to 10
> 20 print "hello world"
> 30 next q
> list
10 for q = 1 to 10
20 print "hello world"
30 next q
>

It can print in quoted text, the (single) string, indexes into the string (zero based), and expressions including variables.
Code:
> print "hello"
hello
> print $
The quick brown fox jumps over the lazy dog
> print $[4]
q
> print (2*2*2*4*355) / (3 * 113)
33
>

And can take multiple things to print per line, using a ';' to inhibit the otherwise automatic newline.
Code:
> print q; "hello "; r
16hello 17
> print q; " hello "; r;
16 hello 17>

(note the position of the last >)

All I need to do now is implement the control sequences... If then endif, do while, for to, goto and gosub.
Oh, and rewrite it all in 65c02 :mrgreen:

A small matter of programming, no doubt.

Neil


Top
 Profile  
Reply with quote  
PostPosted: Fri Oct 11, 2024 6:46 am 
Offline

Joined: Mon Jan 19, 2004 12:49 pm
Posts: 983
Location: Potsdam, DE
We're getting there... assignments to variable and the string are working.
Code:
barnacle@barnacle-Latitude-7480:~/Documents/Tiny basic$ ./tiny
Tiny Basic 65c02
> print $
The quick brown fox jumps over the lazy dog
> let $ = "Hello world"
> print $
Hello world
> print a b c
0
1
2
> let a = 123
> print abc
123
1
2
>


Neil


Top
 Profile  
Reply with quote  
PostPosted: Sun Oct 13, 2024 4:23 pm 
Offline

Joined: Mon Jan 19, 2004 12:49 pm
Posts: 983
Location: Potsdam, DE
Now here's some odd sorting...
Code:
Neolithic Tiny Basic 65c02
> 1000 print
> 1010 for q = 1 to 10
> 1020 getting close
> 1023 closer...
> 1024 broken!
> list
 1024 broken!
 1000 print
 1010 for q = 1 to 10
 1020   getting close
 1023   closer...
> 10 print
> list
   10 print
 1024 broken!
 1000 print
 1010 for q = 1 to 10
 1020   getting close
 1023   closer...

(lines with > are entered at the terminal)

Thinking... :D

Neil


Top
 Profile  
Reply with quote  
PostPosted: Mon Oct 14, 2024 9:25 am 
Offline

Joined: Mon Jan 19, 2004 12:49 pm
Posts: 983
Location: Potsdam, DE
Ah, sorted it. That was a rat's nest... a combination of pointer abuse (casting the first two bytes in a C char buffer to a uint16_t) and type promotion was for some reason returning values above 127 as negative and then casting them to (the wrong) positive values... still no idea of the specific issue as the process works fine elsewhere, but it's fixed now.

So I can write in any order and list:
Code:
> list
  100 print "Mandelbrot - Neo Tiny Basic"
  110 print "Start"
  130 $ = ".,=+:%&$0XB# "
  140 F = 50
  150 for y = -12 to 12
  160   for x = -49 to 25
  170     c = x * 229 / 100
  180     d = y * 416 / 100
  190     a = c
  191     b = d
  192     i = 0
  200     q = b/f
  201     s = b - (q * f)
  210     t = ((a * a)-(b * b)) / f + c
  220     b = 2 * ((a * q) + (a * s / f)) + d
  230     a = t
  231     p = a / f
  232     q = b / f
  240     if ((p * p) + (q * q)) >= 5
  241       goto 280
  242       endif
  250     i = i + 1
  251     if i < 16
  252       goto 200
  253       endif
  260     print " ";
  270     goto 290
  280     print $[i + 1]
  290     next
  300   print
  310   next
  320 print "Finished"

Though I still can't execute anything yet :mrgreen:

Neil

edit: the indentation is automatic as part of the listing. Leading spaces in a line are silently eaten.


Top
 Profile  
Reply with quote  
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 103 posts ]  Go to page Previous  1, 2, 3, 4, 5 ... 7  Next

All times are UTC


Who is online

Users browsing this forum: No registered users and 28 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to: