Logistic regression in Scilab

Let’s create some random data that are split into two different classes, ‘class 0’ and ‘class 1’.

We will use these data as a training set for logistic regression.

[Figure lreg01: scatter plot of the training data, class 0 (blue circles) and class 1 (red crosses)]


// Generate 100 random points in the square [0, b0] x [0, b0]
b0 = 10;
t = b0 * rand(100,2);
// Label each point: class 1 if x1 + x2 > b0, class 0 otherwise
t = [t 0.5+0.5*sign(t(:,2)+t(:,1)-b0)];

// Randomly reassign the labels of points lying in a band of width b
// around the line x1 + x2 = b0, so that the two classes overlap
b = 1;
flip = find(abs(t(:,2)+t(:,1)-b0)<b);
t(flip,$)=grand(length(t(flip,$)),1,"uin",0,1);

// Split the data by class for plotting
t0 = t(find(t(:,$)==0),:);
t1 = t(find(t(:,$)==1),:);

// Plot class 0 as blue circles, class 1 as red crosses
clf(0);scf(0);
plot(t0(:,1),t0(:,2),'bo')
plot(t1(:,1),t1(:,2),'rx')

The data from the two classes overlap slightly. The degree of overlap is controlled by the parameter b in the code.

We want to build a classification model that estimates the probability that a new, incoming data point belongs to class 1.

First, we separate the data into features and labels:


// Features in x, class labels in y
x = t(:, 1:$-1); y = t(:, $);

// m = number of training examples, n = number of features
[m, n] = size(x);

Then we add the intercept column to the feature matrix:


// Add intercept term to x
x = [ones(m, 1) x];

The logistic regression hypothesis is defined as:

h(θ, x) = 1 / (1 + exp(−θᵀx))

Its value is the probability that a data point with features x belongs to class 1.
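The hypothesis can be wrapped in a small Scilab function. This is a minimal sketch, and the name sigmoid is our own choice; the post's code below computes the same expression inline:


// Element-wise sigmoid, used as the logistic regression hypothesis h(θ, x)
function h = sigmoid(z)
    h = ones(z) ./ (1 + exp(-z));
endfunction

// Example: sigmoid([0 2 -2]) returns approximately [0.5 0.881 0.119]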

The cost function in logistic regression is

J = [−yᵀ log(h) − (1−y)ᵀ log(1−h)] / m

where log is the “element-wise” logarithm, not a matrix logarithm.
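For reference, the same cost can be written as a standalone Scilab function. This is a minimal sketch with the function name cost chosen by us; the post computes J inline in the training loop below:


// Logistic regression cost J(theta) for a feature matrix x and labels y
function J = cost(theta, x, y)
    z = x * theta;                    // linear scores, one per example
    h = ones(z) ./ (1 + exp(-z));     // element-wise sigmoid
    m = size(y, 1);                   // number of training examples
    J = (-y'*log(h) - (1-y)'*log(1-h)) / m;
endfunction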

If we use the gradient descent algorithm, then the update rule for θ is

θ ← θ − α ∇J = θ − α xᵀ(h − y) / m

where α is the learning rate.

The code is as follows


// Initialize fitting parameters
theta = zeros(n + 1, 1);

// Learning rate and number of iterations

a = 0.01;
n_iter = 10000;

for iter = 1:n_iter do
    z = x * theta;                      // linear scores for all examples
    h = ones(z) ./ (1+exp(-z));         // hypothesis (element-wise sigmoid)
    theta = theta - a * x' *(h-y) / m;  // gradient descent update
    J(iter) = (-y' * log(h) - (1-y)' * log(1-h))/m;  // cost at this iteration
end

Now, the classification can be visualized. The decision boundary is the line where theta(1) + theta(2)*x1 + theta(3)*x2 = 0, that is, x2 = -(theta(1) + theta(2)*x1)/theta(3):


// Display the result

disp(theta)

// Range of the first feature, for drawing the decision boundary
u = linspace(min(x(:,2)),max(x(:,2)));

clf(1);scf(1);
plot(t0(:,1),t0(:,2),'bo')
plot(t1(:,1),t1(:,2),'rx')
plot(u,-(theta(1)+theta(2)*u)/theta(3),'-g')   // decision boundary

[Figure lreg02: training data with the fitted decision boundary (green line)]

Looks good.
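With the fitted θ we can do what we set out to do: estimate the probability that a new point belongs to class 1. A minimal sketch follows; the point (3, 8) is an arbitrary example of ours, not taken from the training data:


// Probability that a new point (x1, x2) = (3, 8) belongs to class 1
x_new = [1 3 8];                       // prepend the intercept term
p = 1 / (1 + exp(-x_new * theta));
disp(p)

// Assign the class using a 0.5 threshold
disp(bool2s(p >= 0.5))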

The graph of the cost at each iteration is:


// Plot the convergence graph

clf(2);scf(2);
plot(1:n_iter, J');
xtitle('Convergence','Iterations','Cost')

[Figure lreg03: convergence graph, cost vs. iteration]
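As a quick sanity check (this step is not in the original code), we can also look at the fraction of training points the model classifies correctly:


// Training accuracy: fraction of correctly classified training points
h_train = ones(y) ./ (1 + exp(-x * theta));   // predicted probabilities
pred = bool2s(h_train >= 0.5);                // predicted class labels
disp(sum(bool2s(pred == y)) / m)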
