Sunday 2 October 2011

Setting up a Data Repository

A data repository is a database (of source code and content files) that is used to analyze, store and update data. It provides features such as diff and commit that help improve the quality of the software. Its primary role is version control: because everyone can keep updating the data, the software moves to a new version with every update, and the repository manages the integration of each update into the current state of development. It also provides a single interface through which several applications can be used.

An analogy in layman's terms: why use a repository instead of just accessing each file individually?

It’s like comparing getting a book directly from its author with going to a library to get it. With a repository, the author can keep his product (the book) in the library (the repository), which gives it a better chance of reaching people. A repository also provides a simpler and uniform way of installing software.

(Commit command - a commit changes or updates the repository with the specific action described in the commit.)

Advantages of Data Repository-

1- It allows developers in different places to write code in the same repository and update it simultaneously. This saves a lot of time and money, because the code can be written remotely and accessed quickly.

2- Atomic commits and local copies - a commit is applied as a whole or not at all, and anyone can try out changes (such as fixing a problem) in a local repository without disturbing the actual repository. The user publishes a change to the parent repository only if it is needed. Everyone can also test all the files in the repository, which helps raise the standard of the software.

3- It is very efficient for complex merging (of code) or for joining data from multiple sources.

4- Data (code) may only be updated, never deleted, which makes rollback possible at any stage if required.

5- The difference between versions (for example, before a commit) or between any two specific nodes in the data repository can be found using the diff command.

6- A data repository normally has a tree structure, which is very helpful for analysis: nodes at the same level are similar, the parent commit of any commit sits one level above it, and the structure also helps in tracking the path between versions.

7- Queries can be handled more efficiently because all the data is well integrated in the repository.

8- Rollback - the user can roll back to the version that existed before a commit was made by tracking backwards through the data (the tree structure).

9- Several other features, such as backup, optimized use of resources, recovery, generating reports from the repository, security, history and importing data.
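
To make points 4, 5, 6 and 8 a little more concrete, here is a minimal sketch in Python of an update-only history with parent links, a diff and a rollback. The Commit class and the commit, diff and rollback functions are hypothetical names made up for this illustration; they are not how SVN or Git are actually implemented.

```python
import difflib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    """One node in the history tree (hypothetical structure, for illustration)."""
    message: str
    content: str
    parent: Optional["Commit"] = None


def commit(parent, message, content):
    """Record a new version; older versions are never modified or deleted."""
    return Commit(message=message, content=content, parent=parent)


def diff(old, new):
    """Return the unified diff between the contents of two commits."""
    return list(difflib.unified_diff(
        old.content.splitlines(),
        new.content.splitlines(),
        fromfile=old.message,
        tofile=new.message,
        lineterm="",
    ))


def rollback(head, steps=1):
    """Walk back along the parent links and return the earlier commit."""
    node = head
    for _ in range(steps):
        if node.parent is None:
            raise ValueError("cannot roll back past the first commit")
        node = node.parent
    return node


# Two versions of a small C file, the second one fixing the first.
first = commit(None, "first version", "int main() { return 1; }\n")
second = commit(first, "fix return value", "int main() { return 0; }\n")

for line in diff(first, second):
    print(line)

restored = rollback(second)
print(restored.message)
```

Because every commit keeps a link to its parent and nothing is ever erased, rolling back is just a matter of following the links upwards.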

(SVN and Git are two of the most common repository systems used for version control.)

SVN vs. Git

1- Git doesn't record renames explicitly, as opposed to SVN.

2- Git allows the source code to be duplicated: every developer can hold a full copy (clone) of the repository, unlike SVN.

3- In Git there is only one .git directory, at the top level, while SVN (in versions before 1.7) keeps a .svn folder in each directory.

4- Branching and tagging (marking a commit with details) are easier in Git than in SVN.

5- Git doesn't let you download a single folder from the repository on its own, but SVN does.

6- To name a version, Git uses the SHA-1 algorithm, which outputs a 40-character hexadecimal string, while SVN uses ordinary decimal revision numbers.

7- Git doesn't need a centralized server. It allows every developer to branch the project and store it as a local repository, and the developer can work on it even offline.

8- A Git repository uses storage efficiently because its data is kept in a compressed format, so it is usually smaller than the equivalent SVN repository.

9- Git keeps track of content, while SVN keeps a record of files.
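
Points 6 and 9 can be illustrated with a short Python sketch of how Git derives a 40-character name from content. It computes the identifier Git gives to a file's content (a "blob" object): the SHA-1 of a small header followed by the bytes of the file. The function name git_blob_id is made up for this example, and commit identifiers are computed in the same spirit but over a commit object rather than a single file.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 name Git gives to file content stored as a blob."""
    # Git hashes the header "blob <size>\0" followed by the raw content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


version_id = git_blob_id(b"int main() { return 0; }\n")
print(version_id)
print(len(version_id))  # always 40 hexadecimal characters
```

The file name plays no part in the calculation, so two files with identical content get the same identifier, which is one way to see that Git tracks content rather than files.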

Data Repository in Ubuntu

A repository helps the user manage software. When a program needs to be run, all the packages required to run the application (which have already been added to the repository) are available from the repository, so the user does not have to find each package separately. Updates can also be applied automatically.

Data Repository in Windows

On Windows, a repository can manage all the software that has been installed on the system. Updating via a repository is better than the usual Windows updates because of the simpler structure of the repository. Both a command-line interface and a GUI are available for the repository. Repositories are intended to be free of viruses, so software stored in a repository should not have been tampered with by malicious software.

Creating a Data Repository using GIT-

1- Download Git: the download is a single executable file that installs the entire Git system.

2- While installing, check the “Git GUI Here” and “Git Bash Here” options, and select “Use Git Bash only”.

3- Create a directory and then a folder inside it, then right-click the folder and choose “Git GUI Here”.

4- Click on Create New Repository.

5- Fill in the path to your new directory and click Create. You will then be presented with the main interface of Git GUI, which is what will be shown from now on when you right-click on your folder and click Git GUI Here.

Add the details of the user via Edit -> Options.

Place a C file in the directory that contains the .git directory, then press Rescan in Git GUI. You will find that the file has appeared in the left box.

Now, to add the file to the commit, click to the left of the file name.

Write the comment you want to attach to the commit. Make any changes to the file and then click Commit. You will then see that the file has been committed.
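
For readers who prefer the command line, the GUI steps above correspond roughly to a handful of git commands. The Python sketch below only builds and prints those commands. The helper names git_steps and run_steps, the file name hello.c and the user details are all placeholders invented for this example. Running the commands for real is left to the optional run_steps function, which assumes git is installed and that the file already exists in the folder.

```python
import shutil
import subprocess


def git_steps(file_name, message, user_name, user_email):
    """Return git commands that roughly mirror the GUI steps above."""
    return [
        ["git", "init"],                              # Create New Repository
        ["git", "config", "user.name", user_name],    # Edit -> Options
        ["git", "config", "user.email", user_email],
        ["git", "add", file_name],                    # add the file to the commit
        ["git", "commit", "-m", message],             # write the comment, Commit
    ]


def run_steps(repo_dir, steps):
    """Run each command inside repo_dir; requires git on the PATH."""
    if shutil.which("git") is None:
        raise RuntimeError("git executable not found on PATH")
    for argv in steps:
        subprocess.run(argv, cwd=repo_dir, check=True)


# Placeholder values, chosen only for illustration.
steps = git_steps("hello.c", "Add hello.c", "Example User", "user@example.com")
for argv in steps:
    print(" ".join(argv))
```

Calling run_steps with the path to your folder and the list of steps would then carry out the same sequence as the GUI: create the repository, record the user details, add the file and commit it.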