Posts

Multi-threading in Python

Process: A process is an instance of a computer program that is being executed. Every process has three basic components: an executable program; the associated data needed by the program (variables, work space, buffers, etc.); and the execution context of the program (the state of the process). Thread: A thread is an entity within a process that can be scheduled for execution. It is also the smallest unit of processing that an OS can schedule. A thread is a sequence of instructions within a program that can be executed independently of other code. For simplicity, you can think of a thread as a subset of a process. A thread keeps the following information in a Thread Control Block (TCB): Thread identifier: a unique ID (TID) is assigned to every new thread. Stack pointer: points to the thread’s stack in the process; the stack contains the local variables in the thread’s scope. Program counter: a register that stores the address of the instruction currently being executed by the thread. Thread state:
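As a rough illustration of several threads running inside one process, here is a minimal sketch using Python's standard threading module. The names worker and run_threads and the input sizes are hypothetical and chosen only for this example; the identifier printed is whatever threading.get_ident() reports, which is not necessarily the same thing as an OS-level TID.

```python
import threading

def worker(results, index, n):
    # `total` is a local variable, so it lives on this thread's own stack.
    total = 0
    for value in range(n):
        total += value
    # Record the identifier Python reports for this thread next to its result.
    results[index] = (threading.get_ident(), total)

def run_threads(sizes):
    # One result slot per thread; each thread writes only to its own slot.
    results = [None] * len(sizes)
    threads = [
        threading.Thread(target=worker, args=(results, i, n))
        for i, n in enumerate(sizes)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # wait until every thread has finished
    return results

results = run_threads([10, 100, 1000])
for ident, total in results:
    print(f"thread {ident} computed {total}")
```

Each thread gets its own slot in the shared results list, so no lock is needed in this particular sketch; shared data that several threads update would need one.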

Regular Expressions in Python

Regular Expressions: A regular expression (or RE) specifies a set of strings that matches it; the functions in Python's re module let you check whether a particular string matches a given regular expression. Regular expressions can contain both special and ordinary characters. Ordinary Characters: Most ordinary characters, like 'A', 'a', or '0', are the simplest regular expressions; they simply match themselves. Special Characters: Some characters, like '|' or '(', are special. Special characters either stand for classes of ordinary characters or affect how the regular expressions around them are interpreted. The special characters are: a) '.' (Dot): matches any character except a newline b) '^' (Caret): matches the start of the string c) '$': matches the end of the string or just before the newline at the end of the string d) '*': It causes the resulting RE to match 0 or m
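The special characters listed above can be tried out with a short sketch like the one below. It uses only re.search and re.fullmatch from the standard re module; the patterns and sample strings are made up for illustration, and the '*' lines rely on its documented meaning of zero or more repetitions of the preceding expression.

```python
import re

# '.' matches any single character except a newline (by default).
dot_match = re.search(r"a.c", "abc")          # '.' stands in for 'b'
dot_newline = re.search(r"a.c", "a\nc")       # no match: '.' skips '\n'

# '^' anchors the pattern at the start of the string.
caret_match = re.search(r"^py", "python")
caret_miss = re.search(r"^py", "cpython")     # 'py' is not at the start

# '$' matches at the end, or just before a trailing newline.
dollar_match = re.search(r"on$", "python\n")

# '*' allows zero or more repetitions of the preceding expression.
star_zero = re.fullmatch(r"ab*", "a")         # zero 'b' characters
star_many = re.fullmatch(r"ab*", "abbb")      # three 'b' characters

for name, result in [
    ("dot_match", dot_match), ("dot_newline", dot_newline),
    ("caret_match", caret_match), ("caret_miss", caret_miss),
    ("dollar_match", dollar_match),
    ("star_zero", star_zero), ("star_many", star_many),
]:
    print(name, "matched" if result else "no match")
```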

Ensemble Methods

Ensemble methods are machine learning techniques that combine several base models in order to produce one optimal predictive model. An ensemble is itself a supervised learning algorithm, because it can be trained and then used to make predictions. Ensembles tend to yield better results when there is significant diversity among the models. Ensemble techniques (especially bagging) tend to reduce problems related to over-fitting of the training data. Ensembling can reduce variance and bias, two sources of error that cause big differences between predicted and actual results. Types of ensembles: 1) Bayes optimal classifier 2) Bagging 3) Boosting 4) Bayesian parameter averaging 5) Bayesian model combination 6) Bucket of models 7) Stacking 1) Bayes optimal classifier (or) Optimal Bayes classifier: The Optimal Bayes classifier chooses the class that has the greatest a posteriori probability of occurrence (so-called maximum a posteriori estimat
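To make the idea of combining several base models concrete, here is a minimal sketch of majority voting, one simple way of merging predictions (it is not one of the specific ensemble types listed above, though bagging often uses it as its final step). The three base models are hypothetical threshold rules invented for this example, not trained classifiers.

```python
from collections import Counter

# Three hypothetical base models: each labels a number with its own threshold.
def model_a(x):
    return "high" if x > 4 else "low"

def model_b(x):
    return "high" if x > 5 else "low"

def model_c(x):
    return "high" if x > 7 else "low"

def majority_vote(models, x):
    # Collect one vote per base model and return the most common label.
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

models = [model_a, model_b, model_c]
predictions = [majority_vote(models, x) for x in (3, 6, 8)]
print(predictions)
```

For the input 6 the models disagree (two vote "high", one votes "low"), and the ensemble follows the majority. With an odd number of models and two classes, a tie cannot occur.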

What is Naive Bayes Theorem?

Naive Bayes theorem: Naive Bayes is a supervised classification algorithm based on Bayes' theorem, with the assumption of independence among features, used to predict the category of a sample. Naive Bayes is a simple technique for constructing classifiers. It assumes that any feature in a class is independent of any other feature in the class. The main assumption in Naive Bayes is that the features are independent and that every feature has equal importance in a class. Naive Bayes classifiers are probabilistic: they calculate the probability of each category using Bayes' theorem, and the category with the highest probability is the output. Multinomial Naive Bayes: Feature vectors represent the frequencies with which certain events have been generated by a multinomial distribution. This is the event model typically used for document classification. Bernoulli Naive Bayes: In the multivariate Bernoulli event model, features are independent booleans (binary va
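The "highest probability wins" rule for the multinomial event model can be sketched in plain Python as follows. This is only an illustrative sketch: the function names train and predict, the four training sentences, and the use of add-one (Laplace) smoothing are all assumptions made for this example rather than details taken from the post.

```python
import math
from collections import Counter, defaultdict

def train(documents):
    """documents: list of (list_of_words, label) pairs."""
    label_counts = Counter()            # how many documents per label
    word_counts = defaultdict(Counter)  # word frequencies per label
    vocabulary = set()
    for words, label in documents:
        label_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)
    return label_counts, word_counts, vocabulary

def predict(model, words):
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    scores = {}
    for label, count in label_counts.items():
        # Start from the log prior, log P(label).
        score = math.log(count / total_docs)
        total_words = sum(word_counts[label].values())
        for word in words:
            # Assumption: add-one smoothing so unseen words do not give zero.
            likelihood = (word_counts[label][word] + 1) / (
                total_words + len(vocabulary)
            )
            # Independence assumption: per-word log likelihoods simply add up.
            score += math.log(likelihood)
        scores[label] = score
    # The label with the highest (log) probability is the output.
    return max(scores, key=scores.get)

# Hypothetical toy data, for illustration only.
training = [
    ("win money now".split(), "spam"),
    ("win a free prize".split(), "spam"),
    ("meeting at noon".split(), "ham"),
    ("lunch meeting tomorrow".split(), "ham"),
]
model = train(training)
print(predict(model, "free money".split()))
print(predict(model, "meeting tomorrow".split()))
```

Working in log space avoids multiplying many small probabilities together, which could otherwise underflow on longer documents.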