Thursday 15 December 2016

Documentum D2 4.7 is here

Documentum D2 4.7 is here with enhanced features:


1. Introduction of a Java-free client (Documentum Client Manager).
2. Replacement of the D2 Lockbox utility with D2Keystore.
3. The advanced view editor for property pages is back.
4. Installers are available as Docker images as well.
and many more... :-)

Tuesday 29 November 2016

cURL it !!

curl

In this post I would like to highlight how you can set up and use curl, a powerful command-line tool, on a Windows machine. Of its many features, I found curl most useful for testing REST/SOAP services.

cURL is a command line tool for getting or sending files using URL syntax.
It supports a range of common Internet protocols, currently including HTTP, HTTPS, FTP, FTPS, SCP, SFTP, TFTP, LDAP, DAP, DICT, TELNET etc.

You can download the software from below location:


Installation is very easy on Windows.

1. Extract the zip file and keep it in a location on your computer.
2. Go to http://curl.haxx.se/docs/caextract.html and download the digital certificate file named ca-bundle.crt.

[Screenshot: the curl folder in Windows Explorer, containing curl.exe and curl-ca-bundle.crt]


3.       Add the curl folder path to your Windows PATH environment variable so that the curl command is available from any location at the command prompt.

Testing with curl command

I mainly used curl during some Salesforce-related work, so the example URLs may be different in your case. Now you are ready to get your hands dirty ;-)

a. Open a command prompt and type:

b. On a proxy-enabled system, you may get a timed-out error.

c. curl supports proxies, so you can extend the command above as follows:

   i. curl https://ap1.salesforce.com/services/data -H "X-PrettyPrint:1" --proxy <proxy_name>:<port> --proxy-user <proxy_user>:<proxy_user_credential>

   ii. In the above command, -H "X-PrettyPrint:1" is added to get formatted output. You can run the command without that part to see the difference.


d. You may face a certificate error while executing the above command. In a development environment you can skip certificate verification by adding -k after curl. For example:

   i. curl -k https://ap1.salesforce.com/services/data -H "X-PrettyPrint:1" --proxy <proxy_name>:<port> --proxy-user <proxy_user>:<proxy_user_credential>


e. For help or other usage, simply type curl --help


Hope this will be helpful... :-)



Note: The blogs I have written reflect my personal views, based on my personal experience.

Monday 28 November 2016

Why D2?

Why D2? (Old Draft version)

In my earlier blog, I discussed some out-of-the-box features of D2. In this blog, I would like to touch on why customers should choose D2 as their content management system.
With the recent introduction of D2 4.5 and 4.6, EMC has given us one of the strongest product suites for managing business-critical requirements and integration solutions.
In D2 4.6, some major performance and security improvements have been made. The installation module has been improved as well, but there is still scope for improvement.
There are many reasons why you should choose D2 as your content management solution. Below are a few scenarios which can help you make the decision.

1. D2 is all about configuration. D2 provides a more personalized user experience because of its configuration capabilities, which in turn helps to

- Increase productivity
- Reduce the time to implement and go live
- Decrease the cost of end user training
- Reduce maintenance and upgrade effort
- Manage last-minute requirement changes from the business more easily

The configuration options of D2 are very powerful, but specific customer needs that configuration cannot cover can be addressed using D2 customization options as well.


2. Document management requirements such as creation/import, edit, check-in and check-out, version control and auditing are available as out-of-the-box features and are well managed.



3. Annotation is an important feature, required by almost every business division so that people can collaborate within the team. D2 has a solution for that as well: an out-of-the-box feature supports annotation in native format, and PDF annotation can be done with the help of third-party tools such as OpenAnnotate, ARender or BRAVA.


4. D2 can compare different versions of a document, which helps users or auditors track the changes made at each version.


5. One of the most interesting features of Documentum D2 is the external widget, which can be used to integrate any other application in the organization or to show a dashboard. Overall, it opens up a way to integrate other platforms with your DMS.


6. Searching is a critical process in every domain, and D2 has outstanding search capabilities. There are several ways to provide search to end users:


o   Simple search
o   Advanced search
o   Query Form
o   Facet search.

7.     Folder creation based on attribute values using only configuration.

8.     PDF watermarking using C2.

9.     Introduction of new D2 REST API for integrations.

10. The D2 Migration Utility is newly introduced from D2 4.6 onwards. The migration utility deletes and recreates data in the repository.
Many configurations are now protected with ACLs, which leads to a change in the object model. The migration is irreversible, so it is highly recommended to take a backup of the content server and the database before migrating. If you have many objects in your existing D2 tables, the process can take longer than expected.

11. The ability to restore a session to the last browsed location after a session time-out or logout is one of the features I like the most.

Sunday 3 April 2016

Journey towards Machine Learning using R - part 1

"What is Machine Learning?"


Initially, when I started googling and reading about Machine Learning, I felt like rockets were bombarding me from every side :-). Machine learning is a vast area and it is well beyond the scope of this post to cover all of it.

There are many definitions available on the web. In the simplest form we can say "Machine learning refers to the techniques for recognizing and understanding vast amounts of data and making wise decisions based on that data by developing algorithms."

There are several ways to implement machine learning techniques. The most widely used are



For example, recommendation is a popular technique that provides close suggestions based on a user's previous purchases, clicks, and ratings.

Apache Mahout is a classic example. It is an open source project used for producing machine learning algorithms.

There are many open source projects available for producing scalable machine learning algorithms. In this post I will concentrate on the basics of R programming.

Installing R on Machine

The easiest way to set up R is to download a copy of it from here and the IDE RStudio from here, which makes R coding much easier and faster.

After successful installation of R you can launch the GUI console. For reference, please find one sample snapshot below.





Understanding R

Snapshot from RStudio




As you can see from the snapshot, for variable assignment we can use <- or = or ->.
# is used for commenting.
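Since the snapshot may not be visible to everyone, here is a minimal sketch of the three assignment forms and a comment. The variable names x, y, z and total are just my own examples.

```r
# Lines starting with # are comments and are ignored by R.
x <- 10      # leftward arrow, the most common style
y = 20       # equals sign
30 -> z      # rightward arrow: the value is on the left, the name on the right

total <- x + y + z
print(total)  # [1] 60
```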

Data structure

Selecting a data structure to hold data is an important task. In R, data sources can include text files, spreadsheets, statistical packages, databases, etc.

R contains a wide variety of structures for holding data, including scalars, vectors, arrays, data frames and lists. Unlike Java, R does not require variables to be declared with a data type.
We can find out the data type using the command below:
> flag <- TRUE
> print(class(flag))
[1] "logical"
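As a small sketch of the same idea with other types (the variable names below are only illustrative), class() reports whatever type the current value has, and the same name can later hold a value of a different type:

```r
count <- 42L       # the L suffix makes an integer
ratio <- 3.14
label <- "text"

class(count)  # "integer"
class(ratio)  # "numeric"
class(label)  # "character"

# No declaration is needed, so a name can be rebound to another type.
count <- "forty-two"
class(count)  # "character"
```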

Vectors

Vectors are one-dimensional arrays. The combine function c() is used to form a vector.
> a<- c(11,21,31,41,51)
> print(a)
[1] 11 21 31 41 51
> a[3]
[1] 31
> a[2:4]
[1] 21 31 41
Note: A scalar is a one-element vector.
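To make the note about scalars concrete, here is a short sketch (single and doubled are names I made up for illustration). A single number behaves like a vector of length one, and arithmetic on a vector applies to every element:

```r
a <- c(11, 21, 31, 41, 51)

single <- 5
length(single)     # 1, a scalar is just a vector with one element
is.vector(single)  # TRUE
single[1]          # 5, it can be indexed like any other vector

doubled <- a * 2   # arithmetic is applied element by element
print(doubled)     # [1]  22  42  62  82 102
```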

Matrices

A matrix is a two-dimensional array in which every element has the same type, such as numeric, character or logical.
> rownames<-c("Row1","Row2","Row3","Row4","Row5")
> colnames<-c("Column1","Column2","Column3","Column4")
> X<-matrix(1:20,nrow=5,ncol=4,byrow=TRUE,dimnames=list(rownames,colnames))
> x
Error: object 'x' not found
> print(x)
Error in print(x) : object 'x' not found

Please note that variable names are case sensitive, which is why the lowercase x above causes the errors.

> X
     Column1 Column2 Column3 Column4
Row1       1       2       3       4
Row2       5       6       7       8
Row3       9      10      11      12
Row4      13      14      15      16
Row5      17      18      19      20

dimnames: an optional list of row and column labels.
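As a small sketch of what byrow and dimnames do (the object names byRow, byCol and labelled are my own, not from the session above): byrow controls whether the values fill the matrix row by row or column by column, and dimnames lets you pick elements by label.

```r
byRow <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)
byCol <- matrix(1:6, nrow = 2, ncol = 3)   # default is byrow = FALSE

byRow[1, ]  # 1 2 3  (filled along rows)
byCol[1, ]  # 1 3 5  (filled along columns)

labelled <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE,
                   dimnames = list(c("r1", "r2"), c("c1", "c2", "c3")))
labelled["r2", "c3"]  # 6, selected by row and column label
```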


Arrays
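Arrays are similar to matrices but can have more than two dimensions. In the example below, 1:20 supplies only 20 values for 2 x 3 x 4 = 24 cells, so R reuses the values from the start, as the last slice shows.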
> X<-array(1:20,c(2,3,4))
> X
, , 1

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

, , 2

     [,1] [,2] [,3]
[1,]    7    9   11
[2,]    8   10   12

, , 3

     [,1] [,2] [,3]
[1,]   13   15   17
[2,]   14   16   18

, , 4

     [,1] [,2] [,3]
[1,]   19    1    3
[2,]   20    2    4
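Here is a minimal sketch of how to inspect and index such an array, assuming the same X as above. An element is addressed as X[row, column, slice].

```r
X <- array(1:20, c(2, 3, 4))

dim(X)       # 2 3 4
X[2, 3, 1]   # 6, row 2, column 3 of the first slice
X[1, 2, 4]   # 1, a value reused from the start of 1:20
X[, , 2]     # the whole second slice, returned as a 2 x 3 matrix
```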



Data frames

Data frames are the most commonly used data structure in R. A data frame can contain different modes of data, such as numeric and character, but remember that each column must have only one mode.
> studentID<- c(101,102,103,104)
> age<-c(25,24,26,25)
> grade<-c("good","poor","improved","excellent")
> score<-c(70,45,60,90)
> studentDetails<-data.frame(studentID,age,grade,score)
> studentDetails
  studentID age     grade score
1       101  25      good    70
2       102  24      poor    45
3       103  26  improved    60
4       104  25 excellent    90
> studentDetails[1:3]
  studentID age     grade
1       101  25      good
2       102  24      poor
3       103  26  improved
4       104  25 excellent

> studentDetails$score
[1] 70 45 60 90
> studentDetails[c("studentID","score")]
  studentID score
1       101    70
2       102    45
3       103    60
4       104    90
> table(studentDetails$score,studentDetails$grade)
   
     excellent good improved poor
  45         0    0        0    1
  60         0    0        1    0
  70         0    1        0    0
  90         1    0        0    0
> max(studentDetails$score)
[1] 90

Now we can plot the scores against the student IDs using plot(studentDetails$studentID,studentDetails$score).
Execute plot(studentDetails$studentID,studentDetails$score,type = "o") in R and see the result :-)
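The rule that each column holds a single mode can be checked directly. The sketch below rebuilds the same data frame (I pass stringsAsFactors = FALSE explicitly so that grade stays a character column on any R version) and also shows filtering rows by a condition. The names columnTypes, passed and meanScore are my own.

```r
studentDetails <- data.frame(
  studentID = c(101, 102, 103, 104),
  age       = c(25, 24, 26, 25),
  grade     = c("good", "poor", "improved", "excellent"),
  score     = c(70, 45, 60, 90),
  stringsAsFactors = FALSE
)

# One class per column
columnTypes <- sapply(studentDetails, class)
print(columnTypes)

# Keep only the rows where score is at least 60
passed <- studentDetails[studentDetails$score >= 60, ]
print(passed$studentID)   # [1] 101 103 104

meanScore <- mean(studentDetails$score)
print(meanScore)          # [1] 66.25
```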


List

A list can gather any kind of object or structure we have seen so far.
listExample<- list(obj1,obj2,…)
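As a concrete sketch of the pattern above (the element names name, scores and grid are purely illustrative), a list can mix a string, a vector and a matrix, and elements can be reached with $ or [[ ]]:

```r
scores <- c(70, 45, 60, 90)
grid   <- matrix(1:4, nrow = 2)

listExample <- list(name = "demo", scores = scores, grid = grid)

length(listExample)          # 3
listExample$name             # "demo"
listExample[["scores"]][2]   # 45, second element of the scores vector
listExample$grid[2, 2]       # 4
```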


Importing Data into R

  • The edit() function can be used to take input from the user.
It's important to store the data in a variable; otherwise all the entered data will be lost. See the image above.
  • Import data from a text file

If you have RStudio installed, you can take advantage of the autocomplete suggestions, as shown below.



So to get & set current working directory we can use below commands
> getwd()
[1] "C:/Users/aniket/Documents"
> setwd("E:/tmp/data/")
> getwd()
[1] "E:/tmp/data"
 
It's important to set the current working directory to the location of the file which you want to read.
To read a file in table format and create a data frame from it, we can use the options below; we can then manipulate the data in the same way as any other data frame.

  • tableData<-read.table("sample_data4.txt",header=TRUE,sep=",")

  • tableData<-read.delim("sample_data3.txt",header=TRUE,sep=",")
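If you do not have a sample file at hand, the following sketch creates a small comma-separated file in a temporary location and reads it back with read.table. The file contents and the name path are made up for illustration; with your own file you would pass its name instead.

```r
# Write a tiny comma-separated file to a temporary path
path <- tempfile(fileext = ".txt")
writeLines(c("studentID,age,score",
             "101,25,70",
             "102,24,45"), path)

# header = TRUE treats the first line as column names
tableData <- read.table(path, header = TRUE, sep = ",")

nrow(tableData)    # 2
names(tableData)   # "studentID" "age" "score"
tableData$score    # 70 45

unlink(path)       # remove the temporary file
```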


Also we can use the option available in RStudio i.e. Tools->Import Dataset->From Local File…

Note: To get help for any command, use help(), for example help("read.delim2")



We can read and manipulate data from csv and xlsx file formats as well. Sometimes we may have to install new packages for this kind of activity.



Working with R Packages

To see all the installed packages, you can use the library() function.

To install a new package, we can use install.packages("Name of the package")




or we can use the option in RStudio i.e.


Tools->Install Packages…

To load an installed package, you can simply use library("package name")
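The install and load steps can be combined into one small helper. This is only a sketch: loadOrInstall is a hypothetical name of my own, and install.packages() may ask for a CRAN mirror or need network access if the package is really missing.

```r
# Hypothetical helper: install the package only if it is missing, then load it
loadOrInstall <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # character.only = TRUE tells library() that pkg holds the name as a string
  library(pkg, character.only = TRUE)
  invisible(TRUE)
}

# "stats" ships with R, so nothing is downloaded here
loadOrInstall("stats")
```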

  • One interesting package I came across is sqldf, which gives you the power to manipulate data frames using SQL as well.

> library("sqldf")
Loading required package: gsubfn
Loading required package: proto
Loading required package: RSQLite
Loading required package: DBI
> studentID<- c(101,102,103,104)
> age<-c(25,24,26,25)
> grade<-c("good","poor","improved","excellent")
> score<-c(70,45,60,90)
> studentDetails<-data.frame(studentID,age,grade,score)
> QueryData<-sqldf("select * from studentDetails where studentId=101",row.names=TRUE)
Loading required package: tcltk
> QueryData
  studentID age grade score
1       101  25  good    70


R Code Sample

R syntax is different, but if you have a good grasp of a language like Java, it will not take long to get a grip on basic R syntax such as conditions, loops and functions. Below is some sample R code which may be useful.

> new.function <- function(a) { # define a new function
+ if(a%in%8:12){ # check whether a is between 8 and 12
+ for(i in 1:a) { # loop from 1 to the value of a
+ if(i==3){
+ next # works like continue in Java
+ }
+ else{
+ b <- i^2
+ print(b)
+ }
+ }
+ }
+ }
> new.function(9) # call the new function
[1] 1
[1] 4
[1] 16
[1] 25
[1] 36
[1] 49
[1] 64
[1] 81
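For comparison, here is a sketch of a vectorized variant that returns the squares instead of printing them. The name squaresSkipping and the skip parameter are my own additions, and unlike the function above it returns an empty vector when a is outside 8 to 12.

```r
# Returns the squares of 1..a, leaving out the value given in skip
squaresSkipping <- function(a, skip = 3) {
  if (!(a %in% 8:12)) {
    return(numeric(0))        # nothing to compute outside the 8..12 range
  }
  i <- setdiff(1:a, skip)     # 1..a without the skipped value
  i^2                         # squares all remaining values at once
}

squaresSkipping(9)   # 1 4 16 25 36 49 64 81
squaresSkipping(5)   # numeric(0)
```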

Almost every sector, such as Retail, Healthcare & Life Sciences, or Banking, can leverage the benefits of Machine Learning. But we need to identify and understand where exactly we can maximize those benefits.

In the ECM space we can use these techniques to provide better insight into audit trail data for end users or auditors.

Keep me posted with your valuable thoughts, and happy learning ;-).

Make life easier — Git automation with single command file
