Something About the Thing: 2014

Saturday, May 10, 2014

Day 31 : Final Algorithms.

Done.
This one sweet word explains the abrupt absence from documentation. To elaborate, I can say that I have the final algorithms designed, AND implemented.
They officially look like

and

Now I am running simulations, collecting results, and writing the report.

I'll be back soon with some sample results. So Long.

Tuesday, April 22, 2014

Day 30 : cost function - design edit.

I realise now that my design for the cost function was incomplete.
A "cost" should carry an inherent negativeness, and this was missing. So the design edit is this: when the company shifts its clusters, that shift counts as "effort" on the part of the company, and this effort has an associated "cost" that is directly related to the amount of effort spent.

This would allow better simulation of the different "levels of user-involvement" as well as let us monitor effectiveness of different strategies in the long run.
I should implement these changes, and come back with more pretty graphics soon. So Long.

Sunday, April 20, 2014

Day 29 : Cost functions and Spatial clusters : first look.

Finally, a first look at the spatial presentation of the clusters.

The colormap was a bit tricky, but finally it worked, and now the agents change colors as they move towards the cluster centres, in small steps. Also, the cost function makes sure that the cluster centres themselves move toward the agents in even smaller steps.
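A minimal Python sketch of the two-speed movement described above, not the actual thesis code. The step sizes, the nearest-centre assignment, and the names `step_towards` and `update` are assumptions made for illustration.

```python
def step_towards(point, target, fraction):
    """Move `point` a fixed fraction of the way towards `target`."""
    return tuple(p + fraction * (t - p) for p, t in zip(point, target))


def update(agents, centres, agent_step=0.05, centre_step=0.005):
    """One iteration of the two-speed movement.

    Agents take a small step towards their nearest centre (assumed
    assignment rule); each centre takes an even smaller step towards
    the mean position of the agents assigned to it.
    """
    def nearest(agent):
        return min(
            range(len(centres)),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(agent, centres[c])),
        )

    assignment = [nearest(a) for a in agents]
    new_agents = [
        step_towards(a, centres[c], agent_step) for a, c in zip(agents, assignment)
    ]

    new_centres = []
    for c, centre in enumerate(centres):
        members = [a for a, k in zip(agents, assignment) if k == c]
        if members:
            mean = tuple(sum(axis) / len(members) for axis in zip(*members))
            new_centres.append(step_towards(centre, mean, centre_step))
        else:
            new_centres.append(centre)  # a centre with no members stays put
    return new_agents, new_centres, assignment
```

Calling `update` repeatedly and colouring each agent by its entry in `assignment` would give a picture of the kind described in this post.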

The function to measure spread as a function of time will need some tweaking due to the recent modifications, and that should take up the better part of the next week.
So Long.

Wednesday, April 16, 2014

Day 28 : Cleanup!

So, some much needed code cleanup was overdue.
Finally, the belief is a function of the number of connections of the agent, normalised between 0 and 1. Quite an easy, but crucial, modification to the previous model. The new threshold is set at 0.025, for no special reason.
Also, implemented the changes in x-y coordinates, and made the shift fractional, to make it more realistic.
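One way the belief normalisation could look, as a hedged sketch. Dividing by the largest degree is an assumption (the post only says the value is normalised between 0 and 1), and `normalised_belief` is a hypothetical name.

```python
def normalised_belief(degrees):
    """Map each agent's number of connections to a belief in [0, 1].

    `degrees` is a dict {agent_id: number_of_connections}. Scaling by
    the maximum degree is an assumed normalisation, not a confirmed one.
    """
    max_degree = max(degrees.values())
    if max_degree == 0:
        return {agent: 0.0 for agent in degrees}
    return {agent: d / max_degree for agent, d in degrees.items()}
```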

After all these changes, the results for 10,000 agents, over 100,000 iterations look as

Exponential fit for color1 spread

Exponential fit for color 50 spread

2-D Spatial Cluster( with fractional shift)

Interestingly, now c1 and c50 both start at a similar volume (248 and 258 respectively), but still the trend remains same.

Now that the mod is done, I will carry on and implement the cost function. So Long.

Sunday, April 13, 2014

Day 28 : competing trends!

Since the last update, I have been working on a simulation to show how competing trends would spread and often replace each other among the population.

With a few assumptions, I get the following results.

Let 2 colors (competing with each other for the same attribute) diffuse in the network. (I picked color 1 and color 50)
Give color 1 a head start (500 time steps, then color 50 also starts to diffuse, for a total of 10,000 time steps)
Track the diffusion for both.

This algorithm results in the following output

We can see here that c1 takes a lead over c50, and they both exhibit exponential nature of change.
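The head-start experiment can be sketched roughly as follows. This is an illustration under assumptions: the edge-picking rule, the random choice of direction along an edge, and the name `compete` are not from the original code.

```python
import random


def compete(edges, n_nodes, head_start=500, total_steps=10_000, seed=0):
    """Let color 1 diffuse alone for `head_start` steps, then add color 50.

    Returns a list of (t, volume_of_color_1, volume_of_color_50).
    """
    rng = random.Random(seed)
    colour = [rng.randint(1, 50) for _ in range(n_nodes)]
    counts = {1: colour.count(1), 50: colour.count(50)}
    history = []
    for t in range(total_steps):
        active = (1,) if t < head_start else (1, 50)
        u, v = rng.choice(edges)
        if rng.random() < 0.5:  # assumed: influence direction is random
            u, v = v, u
        if colour[u] in active and colour[v] != colour[u]:
            if colour[v] in counts:  # a tracked color loses a member
                counts[colour[v]] -= 1
            counts[colour[u]] += 1
            colour[v] = colour[u]
        history.append((t, counts[1], counts[50]))
    return history
```

Any edge list works as input; for a quick try, a ring such as `[(i, (i + 1) % 200) for i in range(200)]` is enough to see the two volumes evolve.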

Next step, I will introduce a cost function for the change, and see what happens. So Long.

Friday, April 4, 2014

Day 27 : Tracking!

Finally, now the whole simulation works as follows:

Assign different color (between 1 to 50) uniformly to all agents.
Randomly pick 1 color to track.
Let that color diffuse in the network, and track its progress as follows :

t = time step, v = volume, i.e., number of members in that cluster. We keep track of [t, v].
We plot t vs. v in the end.
There are two modes of diffusion. They are :

Free diffusion. In this case, if the edge is selected, diffusion takes place.
Constrained diffusion. In this case, diffusion only happens if the distance in "trust" or "belief" attribute of both nodes is less than 2 (belief value varies from 1 to 5).
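The two modes can be sketched in a few lines of Python. This is a simplified illustration, not the original implementation: picking one random edge per time step and the name `diffuse` are assumptions.

```python
import random


def diffuse(edges, colour, belief, tracked, steps,
            constrained=False, max_gap=2, seed=0):
    """Track the volume of one color as it diffuses over the edges.

    Free mode: a selected edge with exactly one tracked end always converts
    the other end. Constrained mode: conversion also requires the belief
    values of the two nodes to differ by less than `max_gap`.
    Returns the list of (t, v) pairs.
    """
    rng = random.Random(seed)
    colour = list(colour)  # work on a copy
    volume = colour.count(tracked)
    history = [(0, volume)]
    for t in range(1, steps + 1):
        u, v = rng.choice(edges)
        if colour[v] == tracked and colour[u] != tracked:
            u, v = v, u  # orient the edge so that u carries the tracked color
        if colour[u] == tracked and colour[v] != tracked:
            if not constrained or abs(belief[u] - belief[v]) < max_gap:
                colour[v] = tracked
                volume += 1
        history.append((t, volume))
    return history
```

Plotting the first column of the result against the second gives the t vs. v curve for either mode.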

The output for Free diffusion looks like this.

And the output for constrained diffusion looks like

The significant reduction in the number of conversions in the case of constrained diffusion is clear. However, both cases exhibit the nature of an exponential growth, which is interesting.

I shall play with this new model in more detail over the weekend. So Long.

Wednesday, April 2, 2014

Day 26 : Attribute clusters and trust values.

So the next step was to assign attributes 'color' and 'trust' to each node. The 'color' attribute ranges from 1 to 50 , while 'trust' ranges between 0 and 1.
Then, for n iterations, randomly pick an edge and, based on the difference between the 'belief' attributes of its two vertices, let the 'color' attribute of one vertex influence that of the other.

The results on 10,000 nodes, with 4 average connections each, over 100,00 iterations are shown below.

The first figure is for model 1, where the left side shows the initial clustering, and the right shows the final clusters.

And for model 2, it looks like

The same graphs, if the "trust" property is not used, would respectively look as

and

Next, I am working on more simulations, and generalising the "trust" attribute to be a composite of different attributes.

So Long.

Friday, March 28, 2014

Day 25 : Graphs

So I scaled the graphs for up to 50,000 agents, and they look like

for BA model, and like

for the modified model.

A sample graph, for 30 agents, looks like this

Trying out some neighboring effects on this graph, we see the following results.

The neighbours of "Agent 5" show up in blue.

And then neighbours of "Agent 1" show up in pink.

It's interesting to see the transition of "Agent 5" itself.

However, still a lot of work needs to be done to map out the collective effects on the non-hub nodes. So Long.

Tuesday, March 25, 2014

Day 24 : Modifiers!

So today, I tested out the 'Like Mindedness' attribute as a modifier for the connectivity probability on top of the basic BA model.
The result for 1000 and 10,000 agents respectively are as follows :

1. 1000 agents, with average connectivity of 4, under the basic BA model

2. same model with modifier

We can see the decrease in the highest peak value.

3. 10,000 agents, with average connectivity of 4, under the basic BA model

4. same model with modifier

The change in behaviour seems consistent even with the increase in the number of agents.

Next up, I scale it for even larger models, and try out composite modifiers. So Long.

Monday, March 24, 2014

Day 23 : Progress!

Finally, having rested up and consumed loads of Indian food in Hungary, I am back!
This week, I started with a fresh approach to create a BA network, based on an edge-list view instead of going for separate classes, and the output looks like this for 1000 nodes, when seeded with a fully-connected network of 5 agents.

The same network, scaled for a 10,000 agents, now takes less than a minute and looks like this

In both cases, the average degree goes to 4.
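A hedged sketch of an edge-list BA generator in Python. Attaching each new node to `m = 2` existing nodes is an assumption, chosen because with a fully connected seed of 5 it gives an average degree of exactly 4; the name `ba_edge_list` is hypothetical.

```python
import random
from itertools import combinations


def ba_edge_list(n_nodes, seed_size=5, m=2, seed=0):
    """Grow a Barabasi-Albert style network as a plain list of edges."""
    rng = random.Random(seed)
    # Seed: a fully connected network of `seed_size` agents.
    edges = list(combinations(range(seed_size), 2))
    # Each node appears in `targets` once per edge end, so a uniform pick
    # from this list is a degree-proportional (preferential) pick.
    targets = [node for edge in edges for node in edge]
    for new in range(seed_size, n_nodes):
        chosen = set()
        while len(chosen) < m:  # m distinct existing nodes
            chosen.add(rng.choice(targets))
        for old in chosen:
            edges.append((new, old))
            targets.extend((new, old))
    return edges
```

With these defaults the edge count is 10 + 2(n - 5) = 2n, so the average degree 2E/n comes out as 4 for any n.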

Now, I'll link this model to the existing class structure and see how using the belief values or 'likemindedness' affects the connections.

So Long.

Wednesday, March 19, 2014

Day 22 : Graphs!

I have been gone for a while. In the meantime, I made a lot of changes, and the result I have currently looks like the following.

A few days ago, it started getting in shape; a basic 10-node network based on a 3-node seed network looked like this.

Then, increasing the number of nodes to 500 resulted in this.

Clearly, it's lacking in the number of connections being formed, as opposed to normal PA models that do not have any connection-forming criteria. However, I tried to increase the number of connections, and got something like this for the same number of nodes

The sudden shift and spike in the number of connections is clear, but the second peak goes against the nature of the model, so I reverted and tried a different approach. While the above model used a random number of connections for the incoming nodes, I tried (seed/2) connections instead. It yielded the following result for 150 nodes,

And for 700 nodes, it looked like

Now, again, I will be away for the weekend, but the new tests should yield results when I resume with the simulations next week.

So Long.

Monday, March 10, 2014

Day 21 : Video!

So, since I still have a hard time explaining to people what exactly it is that I am trying to achieve (World Domination! Isn't that obvious??!!), I stumbled across this amazing video by Hannah Fry, explaining all sorts of things like hub centres, preferential attachment, and even degrees of connections to some extent.
Go have fun!

Thursday, March 6, 2014

Day 20 : New Approach.

So, Mathematica seems easy to handle, and this change comes with a new approach.

Step 1. Create a Network with fixed number of nodes and connections.
Step 2. Decide on a probability value.
Step 3. Based on that probability, randomly rewire the connections.

This generates something like this for 500 agents, and 1000 connections total.
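The three steps resemble Watts-Strogatz style rewiring. Below is a Python sketch under that assumption (the original was done in Mathematica, and its exact starting network is not stated): a ring of 500 agents, each linked to its 2 nearest neighbours on either side, gives the 1000 connections mentioned above. The name `rewired_ring` is hypothetical.

```python
import random


def rewired_ring(n=500, k=4, p=0.1, seed=0):
    """Steps 1-3: fixed nodes and edges, a probability, random rewiring."""
    rng = random.Random(seed)
    # Step 1: ring lattice with n * k / 2 edges, stored as (low, high) pairs.
    edges = set()
    for i in range(n):
        for offset in range(1, k // 2 + 1):
            j = (i + offset) % n
            edges.add((min(i, j), max(i, j)))
    # Steps 2 and 3: with probability p, move one end of an edge to a
    # random node, skipping self-loops and duplicate edges.
    for u, v in sorted(edges):
        if rng.random() < p:
            w = rng.randrange(n)
            candidate = (min(u, w), max(u, w))
            if w != u and candidate not in edges:
                edges.remove((u, v))
                edges.add(candidate)
    return edges
```

The number of edges stays fixed throughout, since every rewiring removes exactly one edge and adds exactly one.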

Advantages :
1. Faster Execution
2. Lesser LOC.

Disadvantages :
Working on it.
So Long.

Friday, February 28, 2014

Day 19 : The Giants!

Since I have been in bed most of the week, I finally put the Kindle to good use and took up reading.
Here, I'll try to put together a list of the various research I have read about so far, just to summarize the magnitude and depth of the rabbit hole. Here Goes

Barabási–Albert model
This was the starting point for me, and quickly led me to the next checkpoint.
Scale-free network and Power-law decay
Understanding these helped in gaining a better view of the general idea behind the B-A model, and I can't say I wasn't surprised by the absence of the bell curve for a change.
Watts and Strogatz model and Small-world networks
These models help generalise the idea a bit more, and now I realise what exactly is the difference in the scale-free network I aimed for, and the fat-tailed distribution I achieved.
The achieved network classifies as small-world, while the targeted one should be an ultra-small world.

Extra Read :

1. Structure and Function of complex Networks - Newman

2. Scale-Free Networks are Ultrasmall - Cohen, Havlin

3. Introverts and Extroverts - Max Freyd

4. Biomarkers to detect All-Cause Mortality

Now time to go back to more scrambling, also, I started to look into Mathematica. It's fun!

So Long.

p.s. I'll keep coming back to add more useful and notable reads to this post.

Friday, February 21, 2014

Day 18 : Personality profiles

So it turns out our curve is but a part of a fat-tailed distribution, and this tail part very conveniently follows a power-law decay, which led me to believe free scaling was achieved. Alas!
After discussing this issue with my advisor, and several meetings with Maria, a friend from the behavioural science section, I am looking into personality profiles now, to modify the inherent attribute of the agents, with a focus on how people tend to be Introverts/Extroverts.
All the reading is keeping me away from coding for a while, but it is a fascinating area!
Hopefully I'll work up a base and modify the code by the middle of next week. So Long.

Monday, February 17, 2014

Day 17 : I Can Haz Plot!

After spending a week tweaking and tinkering, it finally looks better. I have the graphs for the frequency with which the agents make connections. In simpler terms, this is what we get if we list all the values of NumOfConnections for the agents, and plot every unique value in this list against its frequency in the list.
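In Python, the counting step behind such a plot is short. This is a sketch, with `degree_frequency` as a hypothetical helper name:

```python
from collections import Counter


def degree_frequency(num_of_connections):
    """Pair every unique NumOfConnections value with how often it occurs."""
    counts = Counter(num_of_connections)
    return sorted(counts.items())  # [(value, frequency), ...] sorted by value
```

For example, `degree_frequency([1, 1, 2, 4, 4, 4])` returns `[(1, 2), (2, 1), (4, 3)]`, which can be plotted directly as x against y.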

For a total (seeded as well as dynamically introduced) of 250 agents, the plot looks like

and for 500 agents, it looks like

Although not very clear yet, the constraint shows gaps, as I only let the agents connect for a strong probability (over 0.5) after modification with the belief values. Relaxing this criterion should fill up the gaps and give a more uniform graph; however, we can still see the hyperbolic nature of the graph.

Next To Do :

Try for clearer presentation
Code Cleanup - need to move the "makeConnection" module to a different script to remove redundant code.

Also, now it's time to get in touch with my advisor and treat myself to cake, maybe.

So Long.

Wednesday, February 12, 2014

Day 16 & a -half : Probability Modifiers Ahoy!

So, the additive probability modifiers are up.
They take the old BA probability and modify it based on the distance in the belief values of the two agents in question.
Now I'll spend the next two days rechecking everything and trying to get some plots (stuck between histograms and scatter plots for now) to check if the network scales freely.
So Long.

Interesting addition : This post was originally from yesterday, but then I realised I was wrong in normalising the modified probabilities, so I went ahead and fixed it.
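For illustration, an additive modifier with the renormalisation step could look like the sketch below. The exact form of the bonus, the `weight` parameter, and beliefs lying in [0, 1] are all assumptions, and `modified_probabilities` is a hypothetical name.

```python
def modified_probabilities(degrees, beliefs, new_belief, weight=0.5):
    """Degree-based BA probabilities plus an additive belief bonus.

    `degrees[i]` and `beliefs[i]` describe existing agent i; `new_belief`
    belongs to the incoming agent. Beliefs are assumed to lie in [0, 1].
    """
    total_degree = sum(degrees)
    base = [d / total_degree for d in degrees]  # plain BA probability
    # Additive term: the closer the beliefs, the larger the bonus.
    raw = [
        p + weight * (1.0 - abs(b - new_belief))
        for p, b in zip(base, beliefs)
    ]
    # Renormalise so the modified values form a probability distribution.
    total = sum(raw)
    return [r / total for r in raw]
```

Without the final division the modified values no longer sum to 1, so they cannot be used directly as probabilities.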

Sunday, February 9, 2014

Day 15 : B-A model

So finally the B-A model is up and running.
It can be seeded with N-agents, which make k -number of strong connections based on the distance in their belief factors, and then the user can dynamically introduce new agents to the scenario which form connections based on a popularity probability.

Now the work in progress aims to modify this probability for preferential attachment to account for the belief factors. So Long.

Thursday, February 6, 2014

Day 14 : Back to Hack!

After spending a couple days designing and re-thinking what we have so far, and a meeting with the adviser, it's time to get coding again.
So, we already had enough features to generate a seed network. On top of that, I have now added the feature to dynamically add a new agent, in keeping with the concept of Continuous Growth mentioned earlier.
Then, for a change of flavours, I plotted the numeric behaviour of the random-connected weak links, and for 150-500 agents, it's not really clear if they follow any pattern, and hence, no pretty graphs in today's post.

It's pretty late, and the bed is irresistible now. Implementing the rest of probability calculation will have to wait for another night. So Long.

Wednesday, February 5, 2014

Design 1 : Randomness Prevails

Having a seed network ready, I spent the past day and a half trying to mathematize (yeah, I just made it up) the "magic" mentioned previously.
After looking into logarithms, log-log, and logs of priors, (sudden obsession with logs, it seems, but they are kinda awesome), and some statistical pruning methods over the existing connections, it seems nothing makes too much sense, and hence I decided to go with the closest thing to magic, random functions.
So tonight, I'll start implementing the same. So Long.

In other news, Naiad goes into water tomorrow and can't wait for it. :)

Monday, February 3, 2014

Theory 1 : CGPA - Continuous Growth and Preferential Attachment

YAY! Theory time!! Okay, maybe not the best thing in the world, but important none the less.

As shown by Barabasi and team, real scenarios tend to follow a pattern for connection between nodes. But for all its glory and charm, Math can never assure certainty; the best it can do is give you a probability for a connection to be formed.

and then, with some magic, some connections are established

Something similar to this aforementioned "magic" is portrayed in the new series, Betas , nice timing, eh?!!
While one may argue that human connection is ultimately attributed to this 'magic' factor (others might use words like Charm, affection, courage, luck etc. ) , what math does is give us a tool to better understand what is happening, and peek into the how- and why- of it.

The beauty of the model lies in the fact that it lets new nodes be a part of the network. "I don't need new friends, I have enough." - said no one ever. This is what 'Continuous Growth' refers to.

Also, the part of the model about 'Preferential Attachment' says that heavily connected nodes are more likely to form new connections with an incoming node. In terms of a common man, it's like saying everyone at school knows the captain of the football team.

Keeping these two facts in mind, we'll focus on how hubs are formed around heavily connected nodes.

So now, I need to modify my existing model to act as a seed network and create a bigger network accounting for these two properties. Time to unleash the code monkey inside. So Long.

media courtesy : wikimedia.

Saturday, February 1, 2014

Reading 1 : H2H and communication

Found a really interesting article about H2H. It gives some ideas about the communication. Though it's still in design phase.

Friday, January 31, 2014

Day 10 : Connection Counter is ticking!!

A few hours ago, I finally got the connection counter to work without having to use 'global', and shifted the connection modifiers to strong signals for now. This also proves the possible modularity of the model. Over the weekend, I'll port it out some way (maybe a wrapper?) to actually implement the aforementioned modularity and make it easily accessible.

Now I'm going to treat myself, consume a relatively large amount of candies and watch 'A Beautiful Mind' for the N-th time. So Long.

p.s. - after working on Naiad, I can see a difference in myself where I actually care also about the quality of the code.

Day 9 : different connection -lists

Yesterday, I separated out the connections in two categories, and kept playing with it for a while.
While it still doesn't say much, it's interesting to look at the connections, and it actually makes sense that some agents, while not densely connected, are critical for connecting two parts of the network (e.g., the Kite graph).

However, a new impediment arose. "Global" is giving me a hard time, and the next day or two shall be spent fixing that. I'm also thinking about taking a break, and studying some Swedish over the weekend, for apparently no reason at all. So Long.

Wednesday, January 29, 2014

Day 8 : setter/getter removed

Finally, the time was up, so I decided to go for an alternate approach and got rid of the setter/getters. Now the connection-counter can be modified through an external script, although only unitary changes at a time are possible, and I intend to keep it that way since it should be useful when we get around to computing ranks.
Not much else done today. Tomorrow, I should start changing some logic in the distance, and probably make (at least) two categories of connections, weak and strong. So Long.

p.s. - made some graphs this morning to see how "rand" is working. This output shows 100 agent-nodes in a scatter histogram.

Tuesday, January 28, 2014

Day 7 : 2-D distance metric, issues with Setters

As planned previously, I moved the connection generation from random to a 2-dimensional distance based on belief values of the nodes.
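A small Python sketch of such a distance. Using the Euclidean metric is an assumption (the post does not name the metric), and `belief_distance` is a hypothetical name.

```python
import math


def belief_distance(a, b):
    """Distance between two nodes, each given as a (belief1, belief2) pair.

    Euclidean distance is assumed here; any other 2-D metric would slot
    into the same place.
    """
    return math.hypot(a[0] - b[0], a[1] - b[1])
```

With the second belief value held equal, this reduces to the one-dimensional absolute difference between the first belief values.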

On the other front, a bit more research gave me the idea of "mimic" behaviour amongst nodes, but it needs a proper formulation, and it is still further down the road. What I need to do immediately is focus more on the implementation. It's irritating that it's been over 24 hours and I could not get the setter-getter to work as I want it to.
Decision time, 24 hours more, then I switch to alternate approach. In the mean time, code files have unresolved dependency for the "Main" control module, so they are not yet updated. So Long.

Monday, January 27, 2014

Day 6 : Connections

This morning, I went around and implemented a separate view level for the connections, and through a separate function, the fate for every possible connection is randomly chosen (using Matlab's inbuilt "randi"), going through a complexity of (n*n!).

Although not optimised, it's the most elegant model I could think of so far, as it gives a fair chance to every possible connection, without having to face the hassle of repetitions. As usual, the code is updated in the repository.
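The same idea can be sketched in Python, where `random.randint(0, 1)` plays the role of Matlab's `randi`. The name `random_connections` is hypothetical, and this is an illustration rather than a port of the repository code.

```python
import random
from itertools import combinations


def random_connections(n_agents, seed=0):
    """Decide the fate of every possible connection with a coin flip.

    Each unordered pair (i, j) with i < j is visited exactly once, so no
    connection is repeated and every pair gets the same chance.
    """
    rng = random.Random(seed)
    return [
        (i, j)
        for i, j in combinations(range(n_agents), 2)
        if rng.randint(0, 1) == 1
    ]
```

Note that this particular sketch visits n(n-1)/2 pairs for n agents.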

Next Up:

Increment/Decrement

Change the Degree of the agent node (attribute - NumOfConnections) when a connection is made or severed.
For this, I am currently pondering on how to connect the two views. I will use setter-getters for the Node properties, it's a well accepted method and needs no advocacy. I'll try to have that as a part of the script that generates the connections, or invoke both from a separate script.

So Long.

Saturday, January 25, 2014

Day 5 : Matlab Begins

I started porting the network to Matlab, and after doing some reading, I decided to go with classes and separate *.m functions for EVERYTHING. Not sure if this much modularity will be needed, but I hate working with long, undocumented code.

I created a basic class for agent nodes, having 2 belief values, and 1 enterprise membership, and also provided a method to generate and initialise the nodes. Each node randomly gets assigned one of the four enterprises, and two random values for belief.
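As a rough Python equivalent of that class (the real one lives in the Matlab repository), with hypothetical field and function names:

```python
import random
from dataclasses import dataclass


@dataclass
class AgentNode:
    belief1: float             # first belief value
    belief2: float             # second belief value
    enterprise: int            # enterprise membership, 1 to 4
    num_of_connections: int = 0


def generate_nodes(count, n_enterprises=4, seed=0):
    """Create `count` nodes with random beliefs and a random enterprise."""
    rng = random.Random(seed)
    return [
        AgentNode(rng.random(), rng.random(), rng.randint(1, n_enterprises))
        for _ in range(count)
    ]
```

Drawing the belief values uniformly from [0, 1) is an assumption; the post only says they are random.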

So far, everything has been updated in the repository.

Next step : Assign connections.

Option 1.

From a node-view. i.e., for EachAgent, randomly pick/assign number of connections, and generate those unique connections. but this screws up for bi-directional connections, which is important for my purpose, otherwise, the whole network would be like a big group of people stalking each other. Which, while interesting, is a bit creepy, and let's say is "Out of Scope" for now.

Option 2.

From a connection-view, i.e., from a combination of random n-square (non-unique) possibilities for n-agent nodes, generate the unique connections. This seems the way to go. However, I should think a bit more about it.

So Long.

p.s. - While at the Spring Premiere at the Student Union last Friday, it occurred to me that it is not the best way to go analysing human networks without accounting for the social events, where the interactions between the nodes are much more likely to happen, and also to be different.

Somehow, I should try to simulate this effect based on some Events. Something to think about and discuss with the mentor. Plus, "Simulated drunk people at a party" always looks good on the resume!!

Thursday, January 23, 2014

Day 4 : Belief achieved, so to speak.

This morning, or rather afternoon, while waiting impatiently for my lunch, I realised that I should not go with random distance values. So I initiated the nodes with belief values and computed the distance between adjacent nodes on a One-dimensional scale as difference between the belief values (b1). It's still crude, but a sample output for 100 nodes looks like this.

Next Step : Move distance computation to Two-dimensions.

So Long.

Wednesday, January 22, 2014

Day 3 : Away from Euler, in the pits of Networking

As predicted, I have been away from the plan, and Euler Problems. Mostly been busy sleeping and catching up with the thesis work.
So I have decided to switch, and try to optimise the code for thesis as I go on, and use this blog as a running documentation for the same.

After spending 3 days fixing EPD, everything finally works, and I have decided to write prototypes for testing my hypotheses in Python, small, notebook-style scripts, and then port them to Matlab to suit the requirements for School.
I started out by making a basic list, using Evernote, obviously, and it looks like this.
To start out, I created a basic weighted node, and I realise I'll need an attribute "weight" (to act as distance) for each connection. As initially thought, I could just use the difference in the "belief values" for this weight, but I'd like to see what happens with random weights.

The Basic Network with 100 nodes and 200 random edges looks like this. The dotted lines are less than 0.5 strength while the dark ones are stronger than 0.5.
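A library-free Python sketch of such a network, assuming uniform random weights in [0, 1); the function names are hypothetical and the original prototype may well have been built differently.

```python
import random


def random_weighted_network(n_nodes=100, n_edges=200, seed=0):
    """Return {(u, v): weight} with `n_edges` distinct random edges."""
    rng = random.Random(seed)
    edges = {}
    while len(edges) < n_edges:
        u, v = rng.sample(range(n_nodes), 2)  # two distinct nodes
        edges[(min(u, v), max(u, v))] = rng.random()
    return edges


def split_by_strength(edges, threshold=0.5):
    """Separate weak (dotted) edges from strong (dark) ones."""
    weak = [e for e, w in edges.items() if w < threshold]
    strong = [e for e, w in edges.items() if w >= threshold]
    return weak, strong
```

The two lists returned by `split_by_strength` correspond to the dotted and dark lines in the plot.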

Interesting Observation :- It's funny that I am writing in a "Day X" format, because mostly these entries are made late at night. The Irony!!

Saturday, January 18, 2014

Euler : Day 2

As expected, it's weekend, and productivity went flying out the window. However, I did solve Problem 6 and 7 before signing off for the night.
Interesting observation : Execution time went down from 2.04 s to 0.6 s when I switched from a while loop to the range function for a basic primality test.
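The two styles of test might look like this. It is a sketch of the general idea, not the original solution; timings depend on the machine and are not reproduced here.

```python
import math


def is_prime_while(n):
    """Trial division with a hand-rolled while loop."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def is_prime_range(n):
    """The same test, letting range() drive the iteration."""
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))
```

Both functions test the same divisors; the second leaves the counting to `range` instead of incrementing a Python variable by hand.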

Now it's time to go meet the new arrivals today, let's see if I can squeeze in a few solutions as well. So Long.

Thursday, January 16, 2014

Euler : Day 1

So I solved Problem 1 - 5 today. Choice of environment was obvious for me, Python within EPD on Win7.

Interesting observation: after the second problem, I decided to check for execution times, and basic clock stamps satisfy me. I should try and keep up with the optimisation practices to bring these times down as much as possible.

However, an ethical dilemma arises regarding problem 5. I directly did the math in my head to start with a product value for all primes less than 20. It is not wrong, but to have a more generic code, I should leave it to the program to find out the primes and calculate the number. It's not significant for an upper limit of 20, but that will definitely change as this limit goes up.
Presently, Execution time is 0.06 ms, and can't say I am unhappy with that.
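A more generic version can let the program do the work by folding a least common multiple over the range. This is a sketch and not the original solution; it leans on `math.gcd` and `functools.reduce` from the standard library, which bends the no-libraries rule a little.

```python
from functools import reduce
from math import gcd


def smallest_multiple(limit):
    """Smallest positive number evenly divisible by every integer 1..limit."""
    return reduce(lambda a, b: a * b // gcd(a, b), range(1, limit + 1), 1)
```

`smallest_multiple(10)` gives 2520 and `smallest_multiple(20)` gives 232792560. The result equals the product of the highest prime powers not exceeding the limit (16, 9, 5, 7, 11, 13, 17 and 19 for a limit of 20), which is why a plain product of the primes alone falls short.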

Oh yes, I am trying not to use any libraries, to keep stuff minimalistic, for as long as I can. Let's see how long I can keep that up. So Long.

Ready, Set, Go

As the last semester of school approaches (yes, I still have this weekend to enjoy life as I know it), there has been a feeling somewhere in the depths of my tummy, STUFF NEEDS CHANGE.
Charmed by the society going on and blissfully keeping up with their resolutions, I went ahead and made a list. It looks like this :

Pin-up Notice Board : CHECK
Glass Board and Markers : pending, out of budget
Comfortable Chair : CHECK
Proper, Clean work area away from the bed : CHECK
Proper Schedule : pending, awaiting official instructions from school
Eating Habits (less coffee, more not-coffee) : pending, awaiting divine intervention

So, after having acquired the props, another week went by while I continued to live happily. Finally, keeping up with the traditional rocket-principle, and after consuming a whole lot of chocolate and caffeine, some inspiration and structure showed up last night.

The original, planned notice board was nicely hanging above the bed, then a friend decided to give me another one, which is currently somewhere under the bed for the lack of availability of a nail and hammer. But the chair is unexpectedly awesome, an amazing (belated) Christmas gift from some friends.

Stuff happened. To keep practising basic skills, I took up Project Euler. Presently, since I have some time, it should be doable, as the self-imposed limit is 5 questions a day, which I will document here, and also at this repository.

Not pushing things too far, I realise this routine will be tough to follow on weekends, but it should help redirect the time spent on the computer from shows to code practice. But as always with life, we shall see.