My journey to Python

It starts even before one can think. I was in the first semester of my Bachelor’s in Engineering, and it was December of 2004. I was back home in Jaipur for vacation. There I met my school friend Himanshu Bhojwani, who had made some really cool robotics and other electronic gadgets, and he showed them to me. I was quite curious and asked him how he did all that stuff. Technology and gadgets had always interested me. So he told me about parallel port programming and about Visual Basic 6.0. He said that he did all this by connecting his electronics to the parallel port of his computer and using Visual Basic to get things done. That evening, while returning home, I purchased Parallel Port Programming. My initial plan was to use C++ for the programming part, but seeing how easily VB 6.0 could produce the same programs in much less time, I decided to learn VB 6.0. I also got a neighbor of mine, who was a software engineer at the time, to guide me with VB. Well, as time went on, I forgot about parallel port programming and was making small applications in VB 6.0. This language opened a completely new world for me. From C++ to VB 6.0.

It dates back to January of 2006. I was a fourth-semester student at Bharati Vidyapeeth University, Pune, and those were the days of the college tech fest, Bharatiyam. I was on the college technical team, in the software division, supervised by Hemant Sahni, who was the technical head at that time. He is really good when it comes to application software development. I was initially working with the networking team, but when he found out that I had good VB skills, he asked me to help him, as he was using VB and SQL Server 2000 for the application he was designing. We were building software to manage event scheduling, registrations, accounts and other details for the tech fest. At the same time, another technical head of our university, Siddharth Upmanyu, was working on an online C/C++ compiler. (Yes, the same one that was hosted earlier on this blog.) Somehow I was quite interested in this “online compiler” business. So I went to him and asked, “Sir, how have you made this cool stuff?” He gave me a detailed explanation of the underlying architecture and flow, most of which bounced off my brain. So I went ahead with another question: “Which language did you use to code all this stuff?” “Perl,” came the reply. I had no clue there was a programming language called “Perl”. What kind of a name is this, I was thinking. And then I said, “Perl?” Judging by my expression, he said, “It’s a language similar to Python in some aspects.” And I was even more confused. How can anyone name a language “Python”? Obviously I had no clue what Python was. But I decided not to make a fool of myself again, so I said, “Oh, Python! It’s a cool language,” and hurried back home. LoL

After I came back, I did a quick Google search for Perl and Python and decided to learn one of the languages. Perl looked a little complex to me, so I went for Python, and I haven’t looked back since. I have used Python for almost everything I had to do programmatically!! I have always been in love with Python.

After graduation, I joined Harbinger-Systems, Pune, and worked on ColdFusion MX7 and SQL Server 2000. I used Python for some small programs and bots, which helped me in my day-to-day job. I am currently working with Oxylabs Networks, where I work in a language I love, Python 🙂

This is, in short, my love story with Python: a tale of inspiration, aspiration and willingness.

Open Source Alternative to Dreamweaver

Lately, I was thinking of switching to an open source alternative to DreamWeaver, and I found Aptana. Aptana is an Eclipse-based IDE that provides a lot of benefits, which obviously include being open source and free. Aptana comes in two flavors: a standalone IDE and an Eclipse plugin. You can install either of the two depending on your needs. Personally I don’t like Eclipse, so I decided to go for the standalone IDE. You might want to check out the download page for Aptana.

After installing Aptana, you need to install the CFEclipse plugin to get ColdFusion support in Aptana.

Installing CFEclipse in Aptana

Download the latest version from the archived software update site: http://www.cfeclipse.org/update/
Unzip the archive file. You should get a folder called “org.cfeclipse.cfml.update.release”.
In Aptana, go through the usual procedure to add an update site (Help -> Software Updates -> Find and Install -> Search for new features to install).
Click on New Local Site…
Choose the folder that you extracted from the zip file (e.g. the location of “org.cfeclipse.cfml.update.release”).
Click Select.
Change the name to something more reasonable such as CFEclipse Local Install.
Click OK and then click Finish.
Follow the installation procedure and prompts… (I am sure you can handle it from here on).

Pretty Simple. That’s it. And now we have an open source alternative for DreamWeaver that is much better than the original one.

CFML Parser in Python Part 2

This is an extension of my earlier post on the same topic: a ColdFusion parser in Python that is capable of making CF code resistant to SQL injection attacks. The code below is much improved over the previous version.

#!/usr/bin/env python



#"""
#    A script to make sure all CFM Files have the CFQueryParam validation tags
#    
#    Copyright (C) 2008-09  Pranav Prakash pranav@myblive.com
#    
#    This program is free software: you can redistribute it and/or modify
#    it under the terms of the GNU General Public License as published by
#    the Free Software Foundation, either version 3 of the License, or
#    (at your option) any later version.
#
#    This program is distributed in the hope that it will be useful,
#    but WITHOUT ANY WARRANTY; without even the implied warranty of
#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#    GNU General Public License for more details.
#
#    You should have received a copy of the GNU General Public License
#    along with this program.  If not, see http://www.gnu.org/licenses/
# 
#"""


__author__ = 'Pranav Prakash'
__author_email__ = 'pranny@gmail.com'

import sgmllib, re, string
import sys

# The CFMLParser class parses CFM/CFC files to extract all the CFQuery tags.
# These tags contain the SQL queries that are vulnerable to attack.

class CFMLParser(sgmllib.SGMLParser):
    def __init__(self):
        sgmllib.SGMLParser.__init__(self)
        self.inside_cfquery = False
        self.inside_comment = False
        self.inside_logic = False

        self.SQL_queries = []
        self.query_names = []
        self.newSQL_queries = []
        self.temp = ''

    def start_cfquery(self, attributes):
        self.inside_cfquery = True
        for k, v in attributes:
            if k == 'name':
                self.query_names.append(v)
            if k == 'sql':
                self.query_names.pop()
                self.end_cfquery()

    def end_cfquery(self):
        # Store the text collected for this query and reset the buffer
        if self.temp != '':
            self.SQL_queries.append(self.temp)
        self.temp = ''
        self.inside_cfquery = False

    def start_cfqueryparam(self, attributes):
        self.temp += self.get_starttag_text()
        #print self.temp

    def end_cfqueryparam(self):
        pass

    def start_cfif(self, attributes):
        if self.inside_cfquery:
            self.temp += self.get_starttag_text()
            self.inside_logic = True
        else:
            self.end_cfif()

    def end_cfif(self):
        if self.inside_cfquery:
            self.temp += ''
            self.inside_logic = False
        else:
            pass

    def start_cfelse(self, attributes):
        if self.inside_logic:
            self.temp += self.get_starttag_text()

    def start_cfelseif(self, attributes):
        if self.inside_logic:
            self.temp += self.get_starttag_text()

    def handle_data(self, data):
        # Parentheses make the operator precedence of the original condition explicit
        if self.inside_logic or (self.inside_cfquery and len(data.lstrip()) > 0):
            self.temp += data

    def handle_comment(self, comment):
        if self.inside_cfquery:
            self.temp += ''

    def report_unbalanced(self, tag):
        # Called by sgmllib for an end tag that has no matching start tag
        if tag == 'cfqueryparam':
            self.end_cfqueryparam()
        # cfelse has no end handler in this class, so nothing is called for it

    def get_queries(self):
        return self.SQL_queries

    def get_oldqueries(self):
        return self.SQL_queries

    def get_newqueries(self):
        return self.newSQL_queries

    def ScanQueries(self):
        for query in self.SQL_queries:
            self.ScanSingleQuery(query)

    def ScanSingleQuery(self, query):
        TrainMan = SQLValidator(query)
        TrainMan.Validate_SQL()
        self.newSQL_queries.append(TrainMan.get_ValidatedSQL())

class SQLValidator():
    def __init__(self, sql):
        self.raw_sql = sql
        self.refinedSQL = ''

        # Below are the four regular expressions that cover 'most' of the DML syntax parameters.
        # 'Most' because I am not very strong with SQL :-)
        # Before using CFUnvalidatedGroupingRE, we need to make sure that an IN or VALUES has been
        # used, else it goes on for stored procedures too, and that is pretty evil

        self.variableRE = r"'?#\w+\.?\w+\(?\w*\)?#'?(?![_|\w])"
        self.CFUnvalidatedAssignmentRE = r"[\s*|\,]\w+\.?\w+\s=\s'?#\w+\.?\w+\(?\w*\)?#(?![_|\w])'?"
        self.CFUnvalidatedGroupingRE = r"[\(|\,]\s*'?#\w+\.?\w+\(?\w*\)?#'?(?![_|\w])"
        self.CFUnvalidatedInGroupingRE = re.compile(r"\sIN\s+\(\s*'?#\w+\.?\w+\(?\w*\)?#'?(?![_|\w])", re.IGNORECASE)

        # Variable-name prefix -> cfsqltype
        self.TypeMap = {
            'vch': 'CF_SQL_VARCHAR',
            'bit': 'CF_SQL_BIT',
            'time': 'CF_SQL_TIME',
            'int': 'CF_SQL_INTEGER',
            'now': 'CF_SQL_TIMESTAMP',
            'dat': 'CF_SQL_TIMESTAMP',
        }

    def Validate_SQL(self):
        'picks up raw SQL and creates refined SQL'
        self.refinedSQL = re.sub(self.CFUnvalidatedAssignmentRE, self.handleUnvalidatedAssignment, self.raw_sql)
        self.refinedSQL = re.sub(self.CFUnvalidatedInGroupingRE, self.handleCFUnvalidatedIn, self.refinedSQL)
        self.refinedSQL = re.sub(self.CFUnvalidatedGroupingRE, self.handleUnvalidatedGrouping, self.refinedSQL)

    def handleUnvalidatedAssignment(self, s):
        tag = s.group(0)
        tokenPattern = re.compile(r"#\w+\.?\w+\(?\w*\)?#")
        LHSpattern = re.compile(r"[\s*|\,]\w+\.?\w+\s=\s")
        varcharPattern = re.compile(r"'#\w+\.?\w+\(?\w*\)?#'")

        token = re.findall(tokenPattern, tag)[0]
        predecessor = re.findall(LHSpattern, tag)[0]

        # A quoted token is treated as a string value
        if len(re.findall(varcharPattern, tag)) != 0:
            newToken = self.handleIndividualTokens(token, predecessor, 'CF_SQL_VARCHAR')
        else:
            newToken = self.handleIndividualTokens(token, predecessor)

        finalValue = predecessor + newToken
        return finalValue

    def handleCFUnvalidatedIn(self, s):
        tag = s.group(0)

        tokenPattern = re.compile(r"#\w+\.?\w+\(?\w*\)?#")
        lhsPattern = re.compile(r"\sIN\s+\(\s*", re.IGNORECASE)
        varcharPattern = re.compile(r"'#\w+\.?\w+\(?\w*\)?#'")

        token = re.findall(tokenPattern, tag)[0]
        predecessor = re.findall(lhsPattern, tag)[0]

        if len(re.findall(varcharPattern, tag)) != 0:
            newToken = ''
        else:
            newToken = ''

        finalValue = predecessor + newToken
        return finalValue

    def handleUnvalidatedGrouping(self, s):
        tag = s.group(0)

        tokenPattern = re.compile(r"#\w+\.?\w+\(?\w*\)?#")
        lhsPattern = re.compile(r"[\(|\,]\s*")
        varcharPattern = re.compile(r"'#\w+\.?\w+\(?\w*\)?#'")

        token = re.findall(tokenPattern, tag)[0]
        predecessor = re.findall(lhsPattern, tag)[0]

        # Leave the match alone unless the query uses IN or VALUES
        if re.findall('IN|in|VALUES|values', self.refinedSQL) == []:
            return tag
        else:
            if re.findall(varcharPattern, tag) != []:
                newToken = self.handleIndividualTokens(token, predecessor, 'CF_SQL_VARCHAR')
            else:
                newToken = self.handleIndividualTokens(token, predecessor)
            finalValue = predecessor + newToken
            return finalValue

    def get_ValidatedSQL(self):
        return self.refinedSQL

    def handleIndividualTokens(self, token, predecessor, hint=''):
        if hint == '':
            return ''
        else:
            return ''

    def findDataType(self, rvalue, lvalue='', hint=''):
        # Split "scope.name" into its two parts; rvalue is None when there is no dot
        lvalue = rvalue.split('.')[0]
        try:
            rvalue = rvalue.split('.')[1]
        except IndexError:
            rvalue = None

        if string.find(lvalue.lower(), '#now()#') != -1:
            return 'CF_SQL_TIMESTAMP'

        if rvalue is not None:
            for k in self.TypeMap.keys():
                if rvalue.lower().startswith(k):
                    return self.TypeMap[k]
        else:
            for k in self.TypeMap.keys():
                if lvalue.lower().startswith(k):
                    return self.TypeMap[k]
        return 'CF_SQL_INTEGER'


def ScanAndReplace(o, n, fullText):
    """
    This function replaces all the old SQL queries in a document with the new SQL
    o = list of old queries
    n = list of new queries
    fullText = full text of the file on which the operation is to be performed
    """
    #fullText = fullText.decode('string_escape')
    if len(o) == len(n):
        for i in range(0, len(o)):
            fullText = fullText.replace(o[i], n[i])
            #if fullText.find(o[i]) == -1:
            #    print o[i]
            #    print '----------------'
            #    print n[i]
    return fullText

def check_a_file(infilename):
    f = open(infilename)
    text = f.read()
    f.close()

    myCFMLParser = CFMLParser()
    myCFMLParser.feed(text)
    myCFMLParser.close()
    myCFMLParser.ScanQueries()

    o = myCFMLParser.get_oldqueries()
    n = myCFMLParser.get_newqueries()

    text = ScanAndReplace(o, n, text)

    # o.cfm receives the original queries, n.cfm the rewritten file
    f = open('o.cfm', 'w+')
    for oi in o:
        f.write(str(oi))
    f.close()

    f = open('n.cfm', 'w+')
    f.write(text)
    f.close()


if __name__ == '__main__':
    check_a_file('LMSCoursecreator.cfm')
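The prefix lookup that findDataType performs can be tried on its own. What follows is a minimal sketch and not the script's exact logic: guess_cfsqltype and TYPE_MAP are hypothetical names, the prefixes are copied from TypeMap above, and it assumes variables follow a naming convention such as vchName or intUserID.

```python
# Prefixes copied from the TypeMap in the listing above.
TYPE_MAP = {
    'vch': 'CF_SQL_VARCHAR',
    'bit': 'CF_SQL_BIT',
    'time': 'CF_SQL_TIME',
    'int': 'CF_SQL_INTEGER',
    'now': 'CF_SQL_TIMESTAMP',
    'dat': 'CF_SQL_TIMESTAMP',
}


def guess_cfsqltype(token, default='CF_SQL_INTEGER'):
    """Guess a cfsqltype from a token such as '#form.vchName#' (hypothetical helper)."""
    # Drop the surrounding # signs and any quotes.
    name = token.strip("#'")
    # Use the part after the scope ("form.", "url.", ...) when there is one.
    name = name.split('.')[-1].lower()
    for prefix, cfsqltype in TYPE_MAP.items():
        if name.startswith(prefix):
            return cfsqltype
    # Unknown prefix: fall back to the default type.
    return default


print(guess_cfsqltype('#form.vchName#'))
print(guess_cfsqltype('#url.intUserID#'))
print(guess_cfsqltype('#now()#'))
```

Anything without a known prefix falls back to CF_SQL_INTEGER, the same default the listing uses, so the output still needs a manual look.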

Hope you all find it useful.

ColdFusion Markup Language (CFM) Parser in Python

This is a simple ColdFusion Markup Language (CFM, CFML, CFC) parser written in Python. The parser aims to find the places where CFQueryParam validations are missing and correct them.

In ColdFusion, whenever we place a variable into an SQL statement inside CFQuery tags, we must use the CFQueryParam tag to make sure it is protected from SQL injection. However, there are many cases where these tags have been missed (knowingly or unknowingly) by the developer, and applying the CFQueryParam tags to all of them at a later stage is a very tedious job. So I came up with this script that does the job for you. The project is located at http://code.google.com/p/cfml-sqlvc/. I have also put my very basic script here to help those who are looking for a similar kind of script. You can always go to the project home page and download the latest version, with bug fixes and many more features.
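To make the idea concrete, here is a minimal sketch of the kind of rewrite involved. It is an illustration under simplifying assumptions, not the script itself: parameterize and UNVALIDATED are hypothetical names, the pattern only handles a plain "column = #variable#" assignment, and the cfsqltype is passed in by the caller instead of being inferred.

```python
import re

# Simplified pattern: "column = #variable#" or "column = '#variable#'".
# Group 1 is the left-hand side, group 2 the bare ColdFusion variable.
UNVALIDATED = re.compile(r"(\w+(?:\.\w+)?\s*=\s*)'?(#\w+(?:\.\w+)?#)'?")


def parameterize(sql, cfsqltype="CF_SQL_VARCHAR"):
    """Wrap bare #variables# on the right of '=' in a cfqueryparam tag."""
    def wrap(match):
        lhs, token = match.group(1), match.group(2)
        return '%s<cfqueryparam value="%s" cfsqltype="%s">' % (lhs, token, cfsqltype)
    return UNVALIDATED.sub(wrap, sql)


before = "SELECT * FROM users WHERE userID = #url.intUserID#"
after = parameterize(before, "CF_SQL_INTEGER")
print(before)
print(after)
```

Running the function a second time on its own output changes nothing, because a variable that already sits inside value="..." no longer matches the pattern.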

It scans a particular folder, creates a list of files that need CFQueryParam Validation tags, and then applies them. IT DOES NOT change the original file. It just tells where changes are needed and then displays the new SQL Statement that should be present. So, it leaves scope for manual work. Guys, relax, you won’t be fired 🙂
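The report-only folder scan described above could look roughly like the sketch below. This is an assumption-laden illustration rather than the project's code: scan_folder, queries_needing_validation and both patterns are hypothetical, and a plain regular expression stands in for the SGML parser used in the listing that follows.

```python
import os
import re

# Body of each <cfquery ...> ... </cfquery> block (does not match <cfqueryparam>).
CFQUERY = re.compile(r"<cfquery\b[^>]*>(.*?)</cfquery>", re.IGNORECASE | re.DOTALL)
# A bare #variable# directly on the right-hand side of "=".
BARE_VARIABLE = re.compile(r"=\s*'?#[\w.]+#'?")


def queries_needing_validation(text):
    """Return the cfquery bodies that assign a bare #variable#."""
    return [body for body in CFQUERY.findall(text) if BARE_VARIABLE.search(body)]


def scan_folder(root):
    """Map each .cfm/.cfc path under root to its suspicious queries (read-only)."""
    report = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if name.lower().endswith((".cfm", ".cfc")):
                path = os.path.join(folder, name)
                with open(path) as handle:
                    found = queries_needing_validation(handle.read())
                if found:
                    report[path] = found
    return report
```

The files are only ever opened for reading, so the originals stay untouched and the report can be reviewed by hand.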

And for those who want everything done by this program, or who want to report issues, you can visit the project’s homepage at http://code.google.com/p/cfml-sqlvc/


#!/usr/bin/env python

##"""
##    A script to make sure all CFM Files have the CFQueryParam validation tags
##    
##    Copyright (C) 2008  Pranav Prakash pranny@gmail.com
##    
##    This program is free software: you can redistribute it and/or modify
##    it under the terms of the GNU General Public License as published by
##    the Free Software Foundation, either version 3 of the License, or
##    (at your option) any later version.
##
##    This program is distributed in the hope that it will be useful,
##    but WITHOUT ANY WARRANTY; without even the implied warranty of
##    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
##    GNU General Public License for more details.
##
##    You should have received a copy of the GNU General Public License
##    along with this program.  If not, see http://www.gnu.org/licenses/
## 
##"""

import sgmllib, re

class CFMLParser(sgmllib.SGMLParser):
 def __init__(self, verbose=0):
  sgmllib.SGMLParser.__init__(self, verbose)
  self.insideCFQuery = False
  self.insideComment = False
  self.insideLogic = False
  self.SQLQueries = []
  self.QueryNames = []
  self.NewSQLQueries = []
  self.tempQuery = ''
  self.unvalidatedPattern = re.compile("\s\w+\.?\w+\s=\s'?#\w+\.?\w+\(?\w*\)?#'?")
  self.varcharPattern = re.compile("'#\w+\.?\w+\(?\w*\)?#'")
  self.nonVarcharpattern  = self.token = re.compile("#\w+\.?\w+\(?\w*\)?#")
  self.lhs = re.compile("\s\w+\.?\w+\s=\s")
  self.map = dict({'bit':'CF_SQL_BIT',
     'dat':'CF_SQL_DATE',
     'time':'CF_SQL_TIME',
     'timeStamp':'CF_SQL_TIMESTAMP',
     'int':'CF_SQL_INTEGER',
     'vch':'CF_SQL_VARCHAR'})
  
  
 def start_cfquery(self, attributes):
  self.insideCFQuery = True
  for k,v in attributes:
   if k == 'name':
    self.QueryNames.append(v)
    
 def end_cfquery(self):
  self.insideCFQuery = False
  if self.tempQuery != '':
   self.SQLQueries.append(self.tempQuery)
  self.tempQuery = ''
 
 def start_cfqueryparam(self, attributes):
  self.tempQuery += self.get_starttag_text()
 
 def end_cfqueryparam(self):
  pass
  
 def start_cfif(self, attributes):
  if self.insideCFQuery:
   self.tempQuery += self.get_starttag_text()
   self.insideLogic = True
 
 def end_cfif(self):
  if self.insideLogic:
   self.tempQuery += ''
   self.insideLogic = False
  
 def start_cfelse(self, attributes):
  if self.insideLogic:
   self.tempQuery += self.get_starttag_text()
  
 def handle_data(self, data):
  if self.insideLogic or self.insideCFQuery and len(data.lstrip()) > 0:
   self.tempQuery += data
 
 def handle_comment(self, comment):
  if self.insideCFQuery:
   self.tempQuery += ''

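 def report_unbalanced(self, tag):
  # Called by sgmllib for an end tag that has no matching start tag
  if tag == 'cfqueryparam':
   self.end_cfqueryparam()
  # cfelse has no end handler in this class, so nothing is called for it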
 
 def get_QueryNames(self):
  return self.QueryNames
  
 def get_OldSQLQueries(self):
  return self.SQLQueries

 def get_NewSQLQueries(self):
  return self.NewSQLQueries
  
 def ScanQuery(self, query):
  self.NewSQLQueries.append(re.sub(self.unvalidatedPattern, self.handleIndividualTokens, query))


 def findDataType(self, lvalue, rvalue):
  for k in self.map:
   p = k+'\w+'
   pa = re.compile(p)
   l = pa.findall(rvalue)
   if l != []:
    return self.map.get(k)
  for k in self.map:
   p = '\.?'+k+'\w+'
   pa = re.compile(p)
   l = pa.findall(lvalue)
   if l != []:
    return self.map.get(k)
  return 'CF_SQL_INTEGER'
  
 
 def handleIndividualTokens(self, s):
  
  tag = s.group(0)
  m = self.varcharPattern.findall(tag)
  if len(m) > 0:
   rhsValue = self.token.findall(m[0])[0]
   lhsValue = self.lhs.findall(tag)[0]
   finalVal = lhsValue + ''
   return finalVal
  else:
   lhsValue = self.lhs.findall(tag)[0]
   rhsValue = self.nonVarcharpattern.findall(tag)[0]
   finalVal = lhsValue + ''
   return finalVal
 
 def ScanQueries(self):
  for SQL in self.SQLQueries:
   self.ScanQuery(SQL)

def ScanAndReplace(text):
 myCFMLParser = CFMLParser()
 myCFMLParser.feed(text)
 myCFMLParser.close()
 myCFMLParser.ScanQueries()
 o = myCFMLParser.get_OldSQLQueries()
 n = myCFMLParser.get_NewSQLQueries()
 # Replace every old query with its rewritten counterpart, not just the first and last
 for i in range(len(n)):
  text = text.replace(o[i], n[i])
 return text

 
if __name__ == '__main__':
 inFile = '/home/pranav/projects/cfmparser/test.cfm'
 f = open(inFile, 'r')
 FileContentText = f.read()
 f.close()
 
 print ScanAndReplace(FileContentText)

Don’t forget to check the latest development at http://code.google.com/p/cfml-sqlvc/.

Pranav Prakash…

Hello,
This page is an introduction about me…

Personal Information

Name: Pranav Prakash
DoB: December 20
Occupation: Engineer, Software
University: Bharati Vidyapeeth University, Pune
Specialization: Data Structures, Algorithms, Game Design, eLearning Systems
Current Employer: Oxylabs Networks, India
Past Employer(s) : Harbinger-Systems, India
Languages: Python, ColdFusion, C++
Frameworks: Google App Engine, Fusebox, jQuery, Django
Groups: Pune GTUG Member
