
When does Mathematica create a new Symbol?

Good day,

I previously thought that Mathematica creates new symbols in the current $Context at the stage where the input string (which is assigned to InString) is converted to an input expression (which is assigned to In). But one simple example has broken this explanation:

In[1]:= ?f
During evaluation of In[1]:= Information::notfound: Symbol f not found. >>
In[2]:= Names["`*"]
Out[2]= {}
In[3]:= DownValues[In]//First
InString[1]
Names["`*"]
Out[3]= HoldPattern[In[1]]:>Information[f,LongForm->False]
Out[4]= \(? f\)
Out[5]= {}

You can see that there is no symbol f in the $ContextPath, although it appears to be used already inside the definition of In[1].

This example seems to show that it is in principle possible in Mathematica to make definitions with symbols that do not exist in the $ContextPath without creating them. This could be an interesting alternative to the method of avoiding symbol creation using Symbol:

In[9]:= ff := Symbol["f"]
Names["`*"]
Out[10]= {"ff"}
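To make the Symbol approach concrete, here is a minimal sketch that uses NameQ to check for the symbol before and after ff is evaluated. It assumes a fresh kernel in which no symbol named f exists yet:

```mathematica
(* Assumption: fresh kernel, no symbol named f has been created yet. *)
ff := Symbol["f"];   (* the right-hand side is held, so only the string "f" is stored *)

NameQ["f"]           (* False: defining ff did not create f *)

ff;                  (* evaluating ff calls Symbol["f"], which creates f *)

NameQ["f"]           (* True: f now exists in the current context *)
```

So the symbol is created only when Symbol["f"] is actually evaluated, not when the definition of ff is parsed.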

Can anybody explain under which conditions and at which stage of the evaluation process Mathematica creates new symbols?

EDIT

As Sasha noticed in a comment on this question, I was in fact fooled by the default ShowStringCharacters->False setting for Output cells in the default stylesheet Core.nb, and I missed the FullForm of the output of DownValues[In]//First. In reality the symbol f is not used in the definition of In[1], as we can also see by using InputForm:

In[1]:= ?f
DownValues[In]//First//InputForm
During evaluation of In[1]:= Information::notfound: Symbol f not found. >>
Out[2]//InputForm=
HoldPattern[In[1]] :> Information["f", LongForm -> False]

Sorry for the hasty statement.

So the question now is just about the stage at which Mathematica decides to create a new symbol, and how we can prevent it. For example, in the example above we input f as a symbol, but Mathematica converts it to a String without creating a new symbol. This is built-in behavior of MakeExpression:

In[1]:= ?f
InputForm[MakeExpression[ToExpression@InString[1], StandardForm]]

During evaluation of In[1]:= Information::notfound: Symbol f not found. >>

Out[2]//InputForm=
HoldComplete[Information["f", LongForm -> False]]

Probably it is possible to define some type of syntactic construct that will prevent symbol creation until the evaluation time.

About the stage of evaluation at which a new symbol is created

We can see that $Line is incremented before MakeExpression is called, but new symbols are created and new values are assigned to InString and In only after MakeExpression is called:

In[1]:= MakeExpression[My`boxes_,My`f_]/;!TrueQ[My`$InsideMakeExpression]:=Block[{My`$InsideMakeExpression=True},Print[$Line];Print[DownValues[InString][[All,1]]];Print[DownValues[In][[All,1]]];Print[Names["`*"]];MakeExpression[My`boxes,My`f]];
In[2]:= a
During evaluation of In[2]:= 2
During evaluation of In[2]:= {HoldPattern[InString[1]]}
During evaluation of In[2]:= {HoldPattern[In[1]]}
During evaluation of In[2]:= {}
Out[2]= a

The same can be said about the time at which $PreRead and $NewSymbol are called:

In[1]:= $NewSymbol:=Print["Names[\"`*\"]=",Names["`*"],"\nDownValues[InString]=",DownValues[InString][[All,1]],"\nDownValues[In]=",DownValues[In][[All,1]],"\nName: ",#1,"\tContext: ",#2]&
In[2]:= a
During evaluation of In[2]:= Names["`*"]={}
DownValues[InString]={HoldPattern[InString[1]]}
DownValues[In]={HoldPattern[In[1]]}
Name: a Context: Global`
Out[2]= a

$Pre is executed after the new assignment to In has been made and after all new symbols have been created in the current $Context:

In[1]:= $Pre := (Print[Names["`*"]]; 
   Print[DownValues[In][[All, 1]]]; ##) &

In[2]:= a

During evaluation of In[2]:= {a}

During evaluation of In[2]:= {HoldPattern[In[1]],HoldPattern[In[2]]}

Out[2]= a

It seems that it is not possible to intercept the assignment of a new value to In.


The conclusion: new Symbols are created after calling $PreRead, MakeExpression and $NewSymbol but before calling $Pre.

Alexey Popkov, asked Apr 11 '11 05:04


3 Answers

Regarding your question in the edit part: I am not sure if this is what you had in mind, but in FrontEnd sessions you can use $PreRead to keep symbols as strings during the parsing stage. Here is one possible hack which does it:

symbolQ = StringMatchQ[#, RegularExpression["[a-zA-Z$][a-zA-Z$`0-9]*"]] &;

ClearAll[keepSymbolsAsStrings];
SetAttributes[keepSymbolsAsStrings, HoldAllComplete];

$PreRead  = # //. RowBox[{"keepSymbolsAsStrings", rest___}] :>
 RowBox[{"keepSymbolsAsStrings", 
   Sequence @@ ({rest} //. x_String?symbolQ :>
       With[{context = Quiet[Context[x]]},            
        StringJoin["\"", x, "\""] /; 
         Head[context] === Context])}] &;

The symbol will be converted to a string only if it does not exist yet (which is checked via Context[symbol_string_name]). For example:

In[4]:= keepSymbolsAsStrings[a+b*Sin[c]]//FullForm

Out[4]//FullForm= keepSymbolsAsStrings[Plus["a",Times["b",Sin["c"]]]]

It is important that keepSymbolsAsStrings is defined first, so that this symbol already exists when the input is parsed. This makes it re-entrant:

In[6]:= keepSymbolsAsStrings[a+b*Sin[c]*keepSymbolsAsStrings[d+e*Sin[f]]]//FullForm

Out[6]//FullForm= 
  keepSymbolsAsStrings[Plus["a",Times["b",Sin["c"],
  keepSymbolsAsStrings[Plus["d",Times["e",Sin["f"]]]]]]]

Now, you can handle these symbols (kept as strings) after your code has been parsed, in the way you like. You could also use a different symbolQ function - I just use a simple-minded one for the sake of example.

This won't work for packages though. I don't see a straightforward way to do this for packages. One simplistic approach would be to dynamically redefine Needs, to modify the source on the string level in a similar manner as a sort of a pre-processing stage, and effectively call Needs on the modified source. But string-level source modifications are generally fragile.

HTH

Edit

The above code has a flaw: it is hard to distinguish which strings were meant to be strings and which were symbols converted by the above function. You can modify the code above by replacing ClearAll[keepSymbolsAsStrings] with ClearAll[keepSymbolsAsStrings, symbol] and StringJoin["\"", x, "\""] with RowBox[{"symbol", "[", StringJoin["\"", x, "\""], "]"}], to keep track of which strings in the resulting expression correspond to converted symbols.

Edit 2

Here is the modified code, based on MakeExpression rather than $PreRead, as suggested by @Alexey:

symbolQ =  StringMatchQ[#, RegularExpression["[a-zA-Z$][a-zA-Z$0-9`]*"]] &;

ClearAll[keepSymbolsAsStrings, symbol];
SetAttributes[keepSymbolsAsStrings, HoldAllComplete];

Module[{tried},
 MakeExpression[RowBox[{"keepSymbolsAsStrings", rest___}], form_] :=
  Block[{tried = True},
    MakeExpression[
       RowBox[{"keepSymbolsAsStrings", 
         Sequence @@ ({rest} //. x_String?symbolQ :>
            With[{context = Quiet[Context[x]]},            
             RowBox[{"symbol", "[", StringJoin["\"", x, "\""], "]"}] /;
             Head[context] === Context])}], form]
  ] /;!TrueQ[tried]
]

We need Todd Gayley's trick to break out of the infinite recursion in the definitions of MakeExpression. Here are the examples again:

In[7]:= keepSymbolsAsStrings[a+b*Sin[c]]//FullForm

Out[7]//FullForm= keepSymbolsAsStrings[Plus[symbol["a"],Times[symbol["b"],Sin[symbol["c"]]]]]

In[8]:= keepSymbolsAsStrings[a+b*Sin[c]*keepSymbolsAsStrings[d+e*Sin[f]]]//FullForm

Out[8]//FullForm=  keepSymbolsAsStrings[Plus[symbol["a"],Times[symbol["b"],Sin[symbol["c"]],
keepSymbolsAsStrings[Plus[symbol["d"],Times[symbol["e"],Sin[symbol["f"]]]]]]]]

This method is cleaner since $PreRead is still available to the end user.

Leonid Shifrin, answered Sep 22 '22 22:09


You can use $NewSymbol and $NewMessage to get a better understanding of when a symbol is created. According to the virtual book, a symbol is created in $Context when it can be found in neither $Context nor $ContextPath.
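As a rough sketch of the $NewSymbol part, the hook below just prints the full name of every symbol the kernel creates. The name someFreshName is hypothetical and is assumed not to exist in the session yet:

```mathematica
(* $NewSymbol is applied to the name (#1) and the context (#2) of each new symbol. *)
$NewSymbol = Print["creating ", #2, #1] &;

(* Assumption: someFreshName is a made-up name that does not exist yet.
   It is found in neither $Context nor $ContextPath, so parsing the string creates it. *)
ToExpression["someFreshName"];

(* Sin is found in System` via $ContextPath, so no symbol is created and nothing is printed. *)
ToExpression["Sin"];

$NewSymbol =.;   (* remove the hook again *)
```

The names are passed to ToExpression as strings so that the input line itself does not create them while it is being parsed.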

Sasha, answered Sep 26 '22 22:09


I think your basic understanding that symbols are created when the input is parsed into an expression is correct.

The subtle part is that ? at the beginning of a line (and << and >>) parse specially to allow strings without requiring quotation marks. (The implicit strings here are patterns like *Min* for ? and file names for << and >>.)
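A small sketch of the first point, assuming the hypothetical name parsedOnly is not yet in use: wrapping the parsed input in Hold shows that the symbol comes into existence during parsing, before anything is evaluated.

```mathematica
(* Assumption: parsedOnly is a made-up name that does not exist yet in the current context. *)
NameQ[$Context <> "parsedOnly"]    (* False: not created so far *)

(* Parse the string, but wrap the result in Hold so that it is never evaluated. *)
held = ToExpression["parsedOnly + 1", InputForm, Hold];

NameQ[$Context <> "parsedOnly"]    (* True: the symbol was created by the parser *)
```

With ?parsedOnly at the start of a line, by contrast, the name is read as a string pattern, so no symbol would be created.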

Brett Champion, answered Sep 23 '22 22:09