Monday, October 10th, 2011
Let’s consider the following database table:
| Date | Id | Amount |
| 29.10.2011 | 1001 | 10.2 |
| 30.10.2011 | 1001 | 15.6 |
| 29.10.2011 | 1002 | 17.5 |
| 30.10.2011 | 1002 | 12.2 |
| 31.10.2011 | 1002 | 7.9 |
| 1.11.2011 | 1001 | 19.3 |
| 1.11.2011 | 1002 | 17.9 |
Suppose that from this table we want to generate the following MATLAB matrix storing amount information.
| 10.2 | 15.6 | NaN | 19.3 |
| 17.5 | 12.2 | 7.9 | 17.9 |
To do that we need to map each date and ID with an index. We use containers.Map objects. See Matlab Map Containers page to learn more.
| 1001 | 1 |
| 1002 | 2 |
| 29.10.2011 | 1 |
| 30.10.2011 | 2 |
| 31.10.2011 | 3 |
| 1.11.2011 | 4 |
Next step is to fill matrix Amount. Consider a m x n matrix A. Suppose we want to use matrix B = (1,1;1 2) for selecting elements of A. Unfortunetely A(B(:,1), B(:,2)) will not work. We should use linear indexing. See Matrix indexing to learn more.
T1 = ['29.10.2011'; '30.10.2011'; '29.10.2011'; '30.10.2011'; '31.10.2011'; '1.11.2011'; '1.11.2011'];
T2 = [1001; 1001; 1002; 1002; 1002; 1001; 1002];
T3 = [10.2; 15.6; 17.5; 12.2; 7.9; 19.3; 17;9 ];
keySet = {'29.10.2011', '30.10.2011', '31.10.2011', '01.11.2011'};
valueSet = 1:4;
date_index = containers.Map(keySet,valueSet);
keySet = {1001, 1002};
valueSet = 1:2;
id_index = containers.Map(keySet,valueSet);
Amount = ones(2,4)*NaN;
Amount(sub2ind(size(Amount), values(date_index,T1), values(id_index,T2) ) ) = T3
You can also index dates by using Matlab’s datenum function. This function returns how many days is passed from a reference point (1-Jan-0000). Suppose D matrix contains a list of dates then the following code will also work:
D = {'29.10.2011', '30.10.2011', '29.10.2011', '30.10.2011', '31.10.2011', '01.11.2011'}
datenum(D) - min(datenum(D))
Tags: Matlab
Posted in General | No Comments »
Saturday, June 11th, 2011
In graph theory, degree of a given vertex is the number of vertices which are neighbour to this vertex. Let be a degree sequence then we can check whether it is possible to construct a graph such that the degrees of vertices follows this degree sequence. We call such a degree sequence graphical. The first simple check is to sum all the degree values. This number should even by hand-shakking lemma. This is a necessary condition but not sufficient. For example it is impossible to construct a graph with a degree sequence [3 3 3 1].
Erdős–Gallai theorem gives us necessary and sufficient conditions for this problem.
the sum of the sequence is even and
Suppose the degree sequence is graphical then the following simple procedure construct a possible graph implementation from this given degree sequence.

function E = mcmc_graph(degree_sequence)
n = size(degree_sequence,2);
% Check if the given degree sequence is graphical
degree_sequence_feasible = 1;
if mod(sum(degree_sequence),2)
degree_sequence_feasible = 0;
else
sorted_degree_sequence = sort(degree_sequence,'descend');
for k=1:n
temp_sum = 0;
for i=k+1:n
temp_sum = temp_sum + min(sorted_degree_sequence(i),k);
end
if (sum( sorted_degree_sequence(1:k)) > k*(k-1) + temp_sum )
degree_sequence_feasible = 0;
end
end
end
m = sum(degree_sequence)/2;
E = zeros(m,2);
% Construct an initial graph with Havel-Hakini
if degree_sequence_feasible
ds = degree_sequence;
i = 0;
while sum(ds) > 0
[residual_degree vertices] = sort(ds,'descend');
selected_vertices = vertices(2:residual_degree(1)+1);
E(i+1:i+residual_degree(1),:) = ...
[repmat(vertices(1),residual_degree(1)) selected_vertices'];
i = i + residual_degree(1);
ds(vertices(1)) = ds(vertices(1)) - residual_degree(1);
ds(selected_vertices) = ds(selected_vertices) - 1;
end
Tags: Graph Theory, Matlab
Posted in General | No Comments »
Saturday, March 13th, 2010
I am not using MS Word since I started my master degree; all of the documents that I produced is done in LaTeX. If you are also using LaTeX for your homeworks or you will start writing your thesis soon, I recommend you to visit Ted’s latex template page. Probably the most exciting feature of this template is easiness of including matlab scripts, plus it looks fancy!
Tags: latex, Matlab, template
Posted in PHD | No Comments »
Friday, February 19th, 2010
Missing values in a data set can be occured from several reasons:
One can try to ignore those missing entries however this approach may be misleading especially for small data sets. Sample containing missing values also hold some value and the information can be easily recovered by imputation. The following site contains useful resources, tutorials and links. In this study, I tried to compare different techniques for the imputation process, corresponding matlab files.
Basically there are three methods:
In this approach missing fiels are filled with a number generated according to a probability distribition and the parameters of this distribition can be determined with the rest of that column which are available.
One can also think filling missing value problem as a regression problem. In this case, column containing missing values should be considered as a output values and other columns are dependant variables. Missing values can be estimated with this regression model.
In this modern approach, not only one filled tableau is generated but several and then we take the average of them. If we knew the missing values, then estimating the model parameters would be
straightforward. Similarly, if we knew parameters of the data model, then it would be
possible to obtain unbiased prediction for the missing values. An iterative method can
be used: first predict the missing values based on assumed values for the parameters,
use these predictions to update the parameter estimates, and repeat. The sequence
of parameters converges to maximum-likelihood estimates.
Tags: Classification, Matlab, Pattern Recognition
Posted in Optimization | 2 Comments »
Friday, February 12th, 2010
Traveling Salesman Problem also known as TSP can be defined as follows: Given a number of cities and the costs of travelling from any city to any other city, what is the least-cost round-trip route that visits each city exactly once and then returns to the starting city. This is a good book about TSP. You can find TPS instances in this site. These instances can be used in order to benchmark algorithms. In this report some contructive heuristic methods (nearest neighbor, arbitrary insertion, farthest insertion) as well as a improvement heuristic (2-opt) are compared. Corresponding matlab files are also available.
Tags: Heuristic Methods, Matlab, Network Models
Posted in Optimization | No Comments »